<<

Multivariable and Vector

Lecture Notes for MATH 0200 (Spring 2015)

Frederick Tsz-Ho Fong Department of Brown University

Contents

1 Three-Dimensional ...... 5 1.1 Rectangular Coordinates in R3 5 1.2 Dot Product7 1.3 Cross Product9 1.4 Lines and Planes 11 1.5 Parametric 13

2 Partial Differentiations ...... 19 2.1 Functions of Several Variables 19 2.2 Partial 22 2.3 26 2.4 Directional Derivatives 30 2.5 Planes 34 2.6 Local Extrema 36 2.7 Lagrange’s Multiplier 41 2.8 Optimizations 46

3 Multiple Integrations ...... 49 3.1 Double in Rectangular Coordinates 49 3.2 Fubini’s Theorem for General Regions 53 3.3 Double Integrals in Polar Coordinates 57 3.4 Triple Integrals in Rectangular Coordinates 62 3.5 Triple Integrals in Cylindrical Coordinates 67 3.6 Triple Integrals in Spherical Coordinates 70 4 ...... 75 4.1 Vector Fields on R2 and R3 75 4.2 Integrals of Vector Fields 83 4.3 Conservative Vector Fields 88 4.4 Green’s Theorem 98 4.5 Parametric Surfaces 105 4.6 Stokes’ Theorem 120 4.7 Theorem 127

5 Topics in and ...... 133 5.1 Coulomb’s Law 133 5.2 Introduction to Maxwell’s 137 5.3 Heat Diffusion 141 5.4 Dirac Functions 144 1 — Three-Dimensional Space

1.1 Rectangular Coordinates in R3 Throughout the course, we will use an ordered triple (x, y, z) to represent a point in the three dimensional space. The real numbers x, y and z in an ordered triple (x, y, z) are respectively the x-, y- and z-coordinates which, by convention, are defined according to the following diagram:

z

c (a, b, c)

b a y (a, b, 0) x

Figure 1.1: Rectangular coordinates system in 3-space

Notation We will use the notation R3 to denote the entire three dimensional space.

Any point on the x-axis has the form (x, 0, 0), i.e. y = 0 and z = 0. Similarly, points on the y-axis are of the form (0, y, 0), and points on the z-axis are of the form (0, 0, z). The three coordinate axes meet at a point with coordinates (0, 0, 0) which is called the origin. A vector in R3 is an arrow which is based at one point and is pointing at another point. If a vector v is based at (x0, y0, z0) and points toward (x1, y1, z1), then the vector is written as:

v = (x1 − x0)i + (y1 − y0)j + (z1 − z0)k.

For example, the vector based at (3, 2, −1) pointing at (5, 2, 0) is expressed as

(5 − 3)i + (2 − 2)j + (0 − (−1))k = 2i + k. 6 Three-Dimensional Space

i Consequently, any two vectors that are pointing at the same direction and have the same length are considered to be equal, even though they may have different base points. For instance, a vector v based at (1, 2, 3) pointing at (4, 3, 2), i.e.

v = (4 − 1)i + (3 − 2)j + (2 − 3)k = 3i + j − k,

is considered to be equal to the vector w based at (0, 0, 0) pointing at (3, 1, −1). In other words, we can write w = v.

An alternative notation for a vector is the angle bracket ha, b, ci (to save the hassle of writing down i, j and k:

Notation ha, b, ci = ai + bj + ck.

In this course, we make very little conceptual distinction between a point (x, y, z) and a vector based at (0, 0, 0) pointing at the point (x, y, z). However, speaking of notations, one should use hx, y, zi or xi + yj + zk to denote a vector and (x, y, z) to denote a point so as to avoid confusion. Vector additions can multiplications are defined as follows:

Definition 1.1 — Vector Additions and Scalar Multiplications. Let a = ha1, a2, a3i and b = 3 hb1, b2, b3i be two vectors in R , and c be a real scalar, then:

a + b = ha1 + b1, a2 + b2, a3 + b3i (vector addition) ca = hca1, ca2, ca3i ()

The negative of a vector is defined as: −a = (−1)a. The difference between vectors is defined as a − b = a + (−b).

Geometrically, these vector operations can be represented by the following diagrams:

a a + b b b a a − b

Figure 1.2: Geometric representations of various vector operations

Property Vector additions and scalar multiplications have the following algebraic properties 1. commutative rule: a + b = b + a 2. associatative rule: (a + b) + c = a + (b + c) 3. distributive rules: (λ + µ)a = λa + µa and λ(a + b) = λa + λb 1.2 7 1.2 Dot Product There are two types of products for vectors in R3, namely the dot product and the . The former outputs a scalar whereas the latter outputs a vector. In this section, We first talk about the dot product.

Definition 1.2 — Dot Product. Let a = ha1, a2, a3i and b = hb1, b2, b3i, then dot product between the vectors a and b are defined as:

a · b = a1b1 + a2b2 + a3b3.

It is important to note that the dot product of a vector a = ha1, a2, a3i by itself is given by: 2 2 2 a · a = a1 + a2 + a3 which is incidently the square of the length of the vector a (by the Pythagoreas’ Theorem in R3).

Notation Let a = ha1, a2, a3i. We denote the length of a vector a by |a|, which is given by: q 2 2 2 |a| = a1 + a2 + a3.

It is important to note that a · a = |a|2.

The length of a vector, i.e. |a|, is also called the , or the , of the vector. Property It can easily be verified that the dot product satisfies the following algebraic properties: 1. a · b = b · a. 2. (a + b) · c = a · c + b · c. 3. (λa) · b = λ(a · b) 4. 0 · a = a · 0 = 0.

The following theorem gives the geometric meaning of the dot product: 3 Theorem 1.1 Let a = ha1, a2, a3i and b = hb1, b2, b3i be two vectors in R , and θ be the angle between these two vectors. Then we have:

a · b = |a| |b| cos θ. (1.1)

Proof. The proof uses the Law of Cosines. Consider the triangle in the diagram below:

b a − b

θ a

The side opposite to the angle is represented by the vector a − b. Using the Law of Cosines:

|a − b|2 = |a|2 + |b|2 − 2 |a| |b| cos θ (a − b) · (a − b) = a · a + b · b − 2 |a| |b| cos θ a · a − a · b − b · a + b · b = a · a + b · b − 2 |a| |b| cos θ X X a ·a − 2a · b + b ·XXb = a ·a + b ·XXb − 2 |a| |b| cos θ −2a · b = −2 |a| |b| cos θ a · b = |a| |b| cos θ.

 8 Three-Dimensional Space

One immediate consequence of (1.1) is that it allows us to use the dot product to find the angle between two vectors. Precisely, the angle θ between two vectors a and b is given by:  a · b  θ = cos−1 . |a| |b| The most important case is that the two vectors a and b are perpendicular, also known π as orthogonal. The angle between the vectors is 2 and so we have the following important fact: Corollary 1.2 Two non-zero vectors a and b are orthogonal if and only if a · b = 0.

This corollary is particularly useful to determine whether two vectors are perpendicular.

 Example 1.1 Show that any triangle which is inscribed in a and has one of its side coincides the diameter of the circle must be a right-angled triangle.

 Solution Let O be the center of the circle. Define vectors a and b as in the diagram below.

b

−a O a

We would like to show the vectors in red and blue are orthogonal to each other. By basic vector additions and subtractions:

Red vector = −a − b Blue vector = a − b.

Their dot product equals to

(−a − b) · (a − b) = −a · a + a ·b − b ·a + b · b = − |a|2 + |b|2 (recall v · v = |v|2)

Since both a and b represent the radii of the circle, they have the same magnitude. Therefore |a| = |b| and we have (−a − b) · (a − b) = 0. This shows the red and blue vectors are orthogonal. 1.3 Cross Product 9 1.3 Cross Product The cross product is another important vector operations. In contrast to the dot product, the cross product outputs a vector instead of a scalar. A vector is characterized by its length and direction, we define the cross product by declaring these two attributes:

Definition 1.3 — Cross Product. Given two vectors a = a1i + a2j + a3k and b = b1i + b2j + b3k in R3 with angle θ between them, the cross product between a and b, denoted by a × b, is defined as the vector such that: • the length is given by |a × b| = |a| |b| sin θ, i.e. the of the parallelogram formed by vectors a and b; • the cross product a × b is orthogonal to both a and b; • the direction of a × b is determined by the right-hand grab rule illustrated by the Figure 1.3.

Figure 1.3: right-hand grab rule

From the right-hand grab rule, we can clearly see that a × b and b × a are vectors with the same length but in opposite direction, i.e. a × b = −b × a. The magnitude of |a × b|, which is defined to be |a| |b| sin θ, is the area of the parallelo- gram formed by a and b:

b |b| sin θ

θ a

The following are algebraic properties of the cross products which are all useful. Based on the definition of cross products we presented above, the proofs are purely geometric and are omitted here. Property The cross product satisfies: 1. a × b = −b × a 2. (a + b) × c = a × c + b × c 3. a × 0 = 0 4. a × a = 0 10 Three-Dimensional Space

For simple vectors such as i, j and k, their cross products can be easily found from the definition: i × j = k, j × k = i, k × i = j. For more complicated vectors, the cross product can be computed using the following formula:

Theorem 1.3 — Determinant Formula of Cross Product. Given two vectors a = a1i + a2j + a3k and b = b1i + b2j + b3k, their cross product is given by:

i j k

a × b = a a a (1.2) 1 2 3 b1 b2 b3

= (a2b3 − a3b2)i − (a1b3 − a3b1)j + (a1b2 − a2b1)k.

Proof. The proof follows from expanding:

a × b = (a1i + a2j + a3k) × (b1i + b2j + b3k) using the algebraic properties of the cross product. It is left as an exercise for readers.  The cross product will be extremely useful for us to find a vector which is orthogonal to a .

 Example 1.2 Given three points in the xyz-space:

A(0, 2, −1), B(4, 0, −1), C(7, −3, 0)

Find a vector n which is orthogonal to the plane passing through A, B and C. Moreover, find the area of the triangle 4ABC.

 Solution A vector n is orthogonal to the plane if and only if it is orthogonal to any two (non-parallel) vectors on the plane. We will first find two vectors on the plane and then take the cross product. The outcome will give a vector orthogonal to the two vectors, and hence to the plane. The following two vectors lie on the plane: −→ AB = h4, 0, −1i − h0, 2, −1i = h4, −2, 0i −→ AC = h7, −3, 0i − h0, 2, −1i = h7, −5, 1i −→ −→ Taking the cross product: AB × AC = h−2, −4, −6i. Therefore, the required vector n can be taken to be any scalar multiple of h−2, −4, −6i, such as h1, 2, 3i or h2, 4, 6i. −→ −→ The length of the cross product AB × AC is equal to the area of the parallelogram formed −→ −→ 1 by AB and AC. The area of the triangle 4ABC is 2 of the area of this parallelogram. Therefore, 1 −→ −→ 1 q √ Area of 4ABC = AB × BC = (−2)2 + (−4)2 + (−6)2 = 14. 2 2 1.4 Lines and Planes 11 1.4 Lines and Planes 1.4.1 Parametric Equations of Lines In the three dimensional space, lines are no longer represented by an equation like x + 2y = 1 in the two dimensional plane. In order to represent a straight-line (and a as well), we need to introduce the time t, and think of a straight-line or a curve as the path of a particle travelling as t varies. Suppose the line L is through the point P (x , y , z ) and is parallel to the vector v = 0 0 0 −→0 hv , v , v i. For any variable point P(x, y, z), the vector P P is parallel to the vector v, meaning 1 −→2 3 0 that P0P = tv for some t. Therefore,

hx, y, zi − hx0, y0, z0i = tv hx, y, zi = hx0, y0, z0i + t hv1, v2, v3i

Therefore, we have:

x = x0 + tv1 y = y0 + tv2 z = z0 + tv3

which is called the parametric equation of the line L. It is called this way because the variable t is called the of the line. While intuitively t can be thought as the time, it needs not always be.

Notation In this course, the vector r is ‘reserved’ to denote the vector hx, y, zi.

Using this notation, we can also write the parametric equation of the line L in vector form: −−→ r(t) = OP0 + tv.

The t-variable in r(t) emphasizes the fact that the position vector r depends on t. It can be omitted if it is clear that t is the parameter letter. To summarize, in order to write down the parametric equation of a straight-line, we need two “ingredients": 1. a given point P0 on the line, and 2. the direction v of the line.

1.4.2 Equation of Planes In three , equation of a plane can be represented in the form of Ax + By + Cz = D, as compared to Ax + By = C for straight-lines in two dimensions. For a plane through a given point P0(x0, y0, z0) with a vector n = hA, B, Ci, the equation of the plane is given by:

Ax + By + Cz = Ax0 + By0 + Cz0. (1.3)

Equation (1.3) can be proved by considering a variable point P(x, y, z). As illustrated in −→ Figure 1.4, the vector P0P lies on the plane and therefore is orthogonal to the normal vector n. Therefore, we have: −→ n · P0P = 0 hA, B, Ci · (hx, y, zi − hx0, y0, z0i) = 0 hA, B, Ci · hx, y, zi − hA, B, Ci · hx0, y0, z0i = 0 Ax + By + Cz = Ax0 + By0 + Cz0. 12 Three-Dimensional Space

n

P0 P

Figure 1.4: equation of a plane

 Example 1.3 Determine whether the following points are co-planar, i.e. contained in a single plane:

A(0, 2, −1), B(4, 0, −1), C(7, −3, 0), D(1/3, 1/6, 1/9).

 Solution Since any given three points determine a plane, we will first find an equation of the plane through points A, B and C, then we will check whether D lies on the plane by substituting its coordinates into the equation found. The two “ingredients" of finding the equation of a plane are 1. a given point on the plane; and 2. a normal vector to the plane. In order to find the normal vector the plane through A, B and C, we take the cross product −→ −→ of AB and AC. Incidently, these three points are exactly the same as in Example 1.2, in which we have worked out that a normal vector to the plane through A, B and C is given by n = h1, 2, 3i. Take A(0, 2, −1) to be the given point, then the equation of the plane through A, B and C is given by: 1x + 2y + 3z = (1) + 2(2) + 3(−1). After simplification: x + 2y + 3z = 1. Substitute the coordinates of D into the equation:

1 1 1 LHS = + 2 · + 3 · = 1 = RHS. 3 6 9 Therefore, D lies on the plane through A, B and C. In other words, these four points are co-planar.

i Given a plane Ax + By + Cz = D, the vanishing of some of the A, B, C and D has the following geometric significance (assuming A, B and C are not all zero in all cases): When D = 0, the plane passes through the origin (0, 0, 0). • When C = 0, the normal vector is of the form hA, B, 0i and hence is horizontal. In • this case, the plane vertical. When A = B = 0, the normal vector is of the form h0, 0, Ci and hence is vertical. In • this case, the plane is horizontal. 1.5 Parametric Curves 13 1.5 Parametric Curves 1.5.1 Parametric Equations of Curves In two dimensions, there are two ways to represent a curve, namely in the form of x2 + y2 = 1, or of the form

x = cos t y = sin t.

The former is called the Cartesian equation and the second one is called the parametric equation. However, in three dimensions, a single Cartesian equation such as x2 + y2 + z2 = 1 represents a instead. Therefore, we will only use the parametric equations to present curves in three dimensions. Definition 1.4 — Parametric Equation of a Curve. The parametric equation of a curve is of the form:

x = f (t) y = g(t) z = h(t)

where f (t), g(t) and h(t) are differentiable functions of t. In vector notations, the parametric equation of this curve is written as:

r(t) = f (t)i + g(t)j + h(t)k.

The parametric equation of a straight-line is a special case of parametric equation of a “curve". An interesting example of a parametric curve is the :

t r (t) = (cos t)i + (sin t)j + k. 1 20 It is a curve that goes around the circle but the altitude is constantly increasing. See Figure 1.5a for the computer sketch. Here is another example of a parametric curve. See Figure 1.5b for the sketch.

r2(t) = (sin t)i + (cos t)j + (sin 2t)k.

1.0 0.5 0.0 -0.5 -1.0

1.5 1.0

0.5 1.0

0.0 0.5 1.0 -0.5 0.5 0.0 -1.0 --1.01.0 0.0 -0.5 -0.5 0.0 0.0 -0.5 0.5 0.5 -1.0 1.0 1.0

(a) sketch of r1(t), a helix (b) sketch of r2(t)

Figure 1.5: Sketches of two parametric curves 14 Three-Dimensional Space

1.5.2 Derivatives of Parametric Curves One reason for using vector notations for a parametric curve is that their derivatives carry various physical and geometric meanings. Given a parametric curve r(t) = f (t)i + g(t)j + h(t)k, regarded as the position vector of a particle at time t, then: • the first , r0(t) = f 0(t)i + g0(t)j + h0(t)k, represents the vector of the particle, • geometrically, the first derivative r0(t) (if non-zero) is a tangent vector of the curve, p • the length |r0(t)| = ( f 0(t))2 + (g0(t))2 + (h0(t))2 represents the speed of the particle, and • the , r00(t) = f 00(t)i + g00(t)j + h00(t)k, represents the vector of the particle.

Conservation of In physics, given a particle with mass m travelling along the r(t), the following vector is defined to be the angular momentum of the particle:

L(t) = r(t) × mr0(t).

When L(t) is a non-zero constant vector (independent of t), we say that the angular momentum is conserved. The conservation of angular momentum implies that the path of the particle is contained in a plane. It can be explained as follows: By the definition of cross product, the angular momentum L(t) is always orthogonal to r(t) (and to r0(t) too, but we do not need this). Therefore, at any time t, we have:

L(t) · r(t) = 0.

Let r(t) = x(t)i + y(t)j + z(t)k. If L(t) is a constant vector, it can be expressed as L = Ai + Bj + Ck where A, B and C are fixed numbers. Then:

(Ai + Bj + Ck) · (x(t)i + y(t)j + z(t)k) = 0 Ax(t) + By(t) + Cz(t) = 0.

Therefore, the point (x(t), y(t), z(t)) lies on the plane Ax + By + Cz = 0, which is plane with normal vector L and through the origin. In other words, the path of the particle is confined in this plane.

Product Rules of Differentiating Curves When differentiating the dot or cross product of two curves u(t) and v(t), you may apply the as like in single variable calculus.

Property Given two curves u(t) and v(t), and a scalar f (t), we have: d 0 0 1. dt ( f (t)u(t)) = f (t)u(t) + f (t)u (t) d 0 0 2. dt (u(t) · v(t)) = u (t) · v(t) + u(t) · v (t) d 0 0 3. dt (u(t) × v(t)) = u (t) × v(t) + u(t) × v (t).

Here is a good example on the use of one of the above product rules.

 Example 1.4 Given r(t) represents a particle travelling at uniform speed C. Show that its velocity and acceleration vectors are always orthogonal.

0  Solution The particle is travelling at uniform speed C. Therefore, |r (t)| ≡ C. We want to show that r0(t) · r00(t) ≡ 0, so it is natural to differentiate |r0(t)| with respect to t so that the RHS vanishes and the LHS perhaps may be related to r00(t). However, since |r0(t)| is the form of a square root so it is cumbersome to differentiate it. 1.5 Parametric Curves 15

Instead, we differentiate |r0(t)|2 = C2 using the fact that |r0(t)|2 = r0(t) · r0(t)):

0 2 r (t) = C2 r0(t) · r0(t) = C2 d r0(t) · r0(t) = 0 dt r00(t) · r0(t) + r0(t) · r00(t) = 0 2r0(t) · r00(t) = 0.

Therefore, we have r0(t) · r00(t) = 0, which is desired.

1.5.3 Arc Lengths of Curves For a particle travelling at uniform speed, the distance (i.e. ) travelled is simply: distance = speed × time lapsed As compared to the area of a rectangle: height × width. However, if one is asked to calculate the area under a curve y = f (x), a ≤ x ≤ b, one should consider the

b b ydx = f (x)dx ˆa ˆa

b under the rationale of Riemann sum: a ydx ≈ ∑i yi∆xi. At the same token, if the particle is´ not travelling at uniform speed, one should calculate the distance by integration:

distance = (speed) × d(time), ˆ

as compared to area = height × d(width). Precisely, we have: ´ Theorem 1.4 Given a parametric curve r(t). The arc length of the curve from the point at t = a to the point at t = b is given by:

b arc length = |r0(t)|dt. (1.4) ˆa

 Example 1.5 Find the arc length of the curve: √ 1 2 2 2 3 r(t) = t i + t 2 j + tk 2 3

8  from (0, 0, 0) to 2, 3 , 2 .

 Solution It is simple to verify that the initial point corresponds to t = 0 since r(0) = h0, 0, 0i, 8 whereas the final point corresponds to t = 2 since r(2) = 2, 3 , 2 . √ 0 D 1 E r (t) = t, 2t 2 , 1

0 p r (t) = t2 + 2t + 1 q = (t + 1)2 = t + 1 (note that t + 1 > 0 in our case) 16 Three-Dimensional Space

The desired arc length is given by:

2 2 0 r (t) dt = (t + 1)dt ˆ0 ˆ0 2 t2 = + t 2 0 = 10.

1.5.4 Arc-Length Parametrization Let’s begin our discussion by considering the three curves:

r1(t) = (cos t)i + (sin t)j + tk, 0 ≤ t ≤ 2π r2(t) = (cos 2t)i + (sin 2t)j + 2tk, 0 ≤ t ≤ π

If you plot them using Mathematica, you should find out that these two curves are the same, although their speeds are different. The curve r2(t) is obtained by replacing every t in r1(t) by 2t. The initial and final times are adjusted so that the end-points of both r1 and r2 are (0, 0, 0) and (0, 0, 2π). We say r2 is a reparametrization of r1. If r(s) is a parametric curve such that |r0(s)| = 1 for any s, we say the curve is parametrized by arc-length. For such a parametrization, it is conventional to use s to denote the parameter. Given a parametric curve r(t), in theory one can reparametrize the curve by arc-length, such that with the new parameter s, the curve r(s) travels at unit speed. To find the arc-length parametrization, you may follow the procedure: 1. Given a curve r(t) : [a, b] → R3, compute the following integral:

t 0 s = r (τ) dτ. ˆa 2. Since the upper of the above integral is t, the function s should be a function of t. Express t in terms of s whenever it is possible, so that t is a function of s, i.e. t = t(s). 3. Finally, replace all t’s by this function of s in the curve r(t). The new parametrization r(s) will be arc-length parametrized. Let’s see some examples before we learn why it works:

 Example 1.6 Find the arc-length parametrization of the curve:

r(t) = (cos t)i + (sin t)j + tk, t ∈ [0, 2π].

 Solution By straight-forward computations, we get √ 0 r (t) = 2.

Therefore, t t √ √ 0 s(t) = r (τ) dτ = 2dτ = 2t. ˆ0 ˆ0 Express t in terms of s, we get t = √s . Replace all t’s in r(t) by √s , we get an arc-length 2 2 parametrization:  s   s  s r(s) = cos √ i + sin √ j + √ k. 2 2 2 1.5 Parametric Curves 17

 Example 1.7 Find the arc-length parametrization of the curve: √ 1 2 2 2 3 r(t) = t i + t 2 j + tk, t ≥ 0. 2 3

 Solution By straight-forward computations (done before), we get:

0 r (t) = t + 1.

Consider: t t 2 0 t s = r (τ) dτ = (τ + 1)dτ = + t. ˆ0 ˆ0 2 To solve t in terms of s, we use the quadratic equation. One should get: √ −2 + 4 + 8s √ t = = −1 + 1 + 2s. 2 Finally, replace all t’s in r(t) by this function of s, we get an arc-length parametrization: √ 1  √ 2 2 2  √ 3/2  √  r(s) = −1 + 1 + 2s i + −1 + 1 + 2s j + −1 + 1 + 2s k. 2 3

To see why this procedure gives an arc-length parametrization, we need to show |r0(s)| = 1. We use the Fundamental Theorem of Calculus and the chain rule:

0 dr dr dt r (s) = = ds dt ds 0 1 = (t) r ds . dt Recall that s is defined to be t 0 s = r (τ) dτ. ˆa The Fundamental Theorem of Calculus tells us that:

ds 0 = r (t) dt and so, 0 0 1 r (s) = r (t) · = 1. |r0(t)| Although the above procedure of finding arc-length parametrization works in the two examples we have seen, in general it may be hard to find an arc-length parametrization. Since both steps – integration and solving t in terms of s – can be difficult if the given curve r(t) is not nice. The arc-length parametrization plays a significant role in finding and torsion of a curve. Interested readers may consult Briggs-Cochran-Gillett’s book Section 12.9 for detail. In this course, the arc-length parametrizations will come up later in the chapters.

2 — Partial Differentiations

2.1 Functions of Several Variables As the name implies, a function of several variables, or a multivariable function, is one that not only depends on one but several quantities. Examples of which include:

of a cylinder = πr2h

where r is the radius of the cylinder and h is the height. Symbolically, we denote the function for the volume of a cylinder by V(r, h), which indicates V depends on both r and h. We can write:

V(r, h) = πr2h.

In this chapter, we will extend theory and applications of single-variable differentiation to multivariable differentiation. Multivariable integration will be discussed in the next chapter.

2.1.1 Graphs of Two-Variable Functions In single-variable calculus, we visualize a function y = f (x) through its graph. The horizontal x-axis stands for the inputs, and the height of the graph above x represents the output f (x). Many concepts in single-variable calculus, such as derivatives, integrals, critical points, etc. are introduced using the . For functions of two variables, i.e. f (x, y), the graph is no longer a curve in R2, but a surface in R3. The inputs involve two variables x and y, and are represented by points on the xy-plane. The value of the function f (x, y) is now represented by the height z of the surface above the point (x, y). 20 13.2 Graphs and LevelPartial Curves Differentiations 897

z z Function f assigns to each point (x, y) in the domain a real number z ϭ f (x, y).

(x, y, f (x, y)) (x, y, f (x, y))

Graph of f

y y

(x, y) (x, y) x x Domain of f Domain of f z FIGURE 13.21 Figure 2.1: value of f (x0, y0) is the height of the surface above the point (x0, y0, 0) Like functions of one variable, functions of two variables must pass a vertical line 1 2 = test. A relation of the form F x, y, z 0 is a function provided every line parallel to the z@axis intersects the graph of F at most once. For example, an ellipsoid (discussed in Sec- x y tion 13.1) is not the graph of a function because some vertical lines intersect the surface twice. On the other hand, an elliptic paraboloid of the form z = ax 2 + by 2 does represent An ellipsoid does not pass a function (Figure 13.22). the vertical line test: not the graph of a function.

QUICK CHECK 2 Does the graph of a hyperboloid of one sheet represent a function? Does z the graph of a cone with its axis parallel to the x-axis represent a function? ➤

EXAMPLE 2 Graphing two-variable functions Find the domain and range of the (a) the graph of z = x2 − y2 (b) the graph of z = x2 + y2 following functions. Then sketch a graph. 1 2 = + Figure- 2.2: graphs1 of two-variables2 = 2 + 2 functions a. f x, y 2x 3y 12 b. g x, y x y c. h1x, y2 = 21 + x 2 + y 2 2.1.2 LevelSOLUTION Set Diagrams Another common= way1 to visualize2 a two-variable function= + is through- its level+ sets:- = a. Letting z f x, y , we have the equation z 2x 3y 12, or 2x 3y z 12, n Definitionwhich 2.1 describes — Level a Sets. planeGiven with a normal a function vectorf (8x2,1, 3,... -,1x9n )(Section: R → 13.1).R, The a level domain set of the n 2 x y functionconsistsf is a of subset all points of R in ޒof,the and form: the range is ޒ. We sketch the surface by noting that the x-intercept is 16, 0, 02 1setting y = z = 02; the y-intercept is 10, 4, 02 and the This elliptic paraboloid 1 - 2 f (x1,..., xn) = c passes the vertical line test: z-intercept is 0, 0, 12 (Figure 13.23). graph of a function. where c is a constant. FIGURE 13.22 z Given f (x, y) = x2 + y2, which is a function from R2 to R. An example of a of f is x2 + y2 = 1, which is a unit circle on R2 centered atPlane the origin. ϭ ϭ ϩ Ϫ | {z } (6, 0, 0) z f (x, y) 2x 3y 12 f (x,y) By taking c to be differentx values, we get several level sets on the plane. They are centered at the origin with varying radii depending on the value of c chosen: (0, 4, 0) (0, 0, Ϫ12) c = 0 x2 +y y2 = 0 the origin only c = 1 x2 + y2 = 1 radius = 1 √ c = 2 x2 + y2 = 2 radius = 2 FIGURE 13.23 √ c = 3 x2 + y2 = 3 radius = 3

The level set diagram of the two-variable function f (x, y) consists of some representative level sets of the function on R2. See Figure 2.3. 2.1 Functions of Several Variables 21

3 16 12 10 14 16

14 6 2 8 12

10 2 1

0

-1

10 4 12 -2 14

16 -3 14 10 12 16 -3 -2 -1 0 1 2 3

Figure 2.3: level set diagram of f (x, y) = x2 + y2

For three-variable functions f (x, y, z), we do not attempt to visualize its graph, but we can visualize its level set diagram. The former requires the fourth while the latter can be visualized in R3.A generic level set of a three-variable function f (x, y, z) is a surface in R3.

(a) some level sets of x2 + y2 + z2 (b) some level sets of x3 + y2 − z2

Figure 2.4: level sets of three-variable functions 22 Partial Differentiations 2.2 Partial Derivatives 2.2.1 First Derivatives Given a multivariable function such as f (x, y), one can talk about derivatives with respect to both variables x and y. Taking partial derivatives means differentiating f (x, y) with respect to one of the variables while keeping the other variables fixed. Definition 2.2 — Partial Derivatives. Given a multivariable function f (x, y), we define

∂ f (x, y) := the derivative of f (x, y) with respect to x regarding y constant ∂x f (x + h, y) − f (x, y) = lim ; h→0 h ∂ f (x, y) := the derivative of f (x, y) with respect to y regarding x constant ∂y f (x, y + h) − f (x, y) = lim . h→0 h

Notation ∂ f ∂ f Alternatively, we sometimes denote ∂x by fx, and ∂y by fy. Note also that we do 0 not use f (x, y) for multivariable functions, since it is ambigious to whether it means fx or fy.

Computations of partial derivatives are as easy as single-variable derivatives, as illustrated in the following example:

Example 2.1 ∂ f ∂ f  Find ∂x and ∂y for the function:

f (x, y) = x2 sin(xy).

Solution ∂ f  To calculate ∂x , we regard y as a constant: ∂ f ∂   = x2 sin(xy) (regarding y constant) ∂x ∂x ∂x2 ∂ = sin(xy) + x2 sin(xy) (product rule) ∂x ∂x ∂ = 2x sin(xy) + x2 · cos(xy) xy ∂x = 2x sin(xy) + x2y cos(xy).

∂ f Similarly, to calculate ∂y , regard x as a constant:

∂ f ∂   = x2 sin(xy) ∂y ∂y ∂ = x2 sin(xy) (here x2 is regarded as a constant) ∂y ∂ = x2 cos(xy) · xy ∂y = x2 cos(xy) · x = x3 cos(xy). 2.2 Partial Derivatives 23

i Similar to single-variable calculus, to evaluate fx at a fixed point say (x, y) = (1, π), one should perform the differentiation first, and then substitute (x, y) = (1, π) into the derivative. Not the other way round! For example,

2 2 fx(1, π) = 2x sin(xy) + x y cos(xy) = 2 · 1 sin π + 1 · π cos(π) = −π. (x,y)=(1,π)

Geometric interpretation of partial derivatives The geometric meaning of the fx is illustrated in Figure 2.5. By keeping y constant (say we let y = b) and let x varies, the path traced on the surface is the curve of intersection between the plane y = b and the graph of the function. This curve is sometimes called an x-curve. The partial derivative fx(a, b) is the of the tangent to this x-curve at 918 Chapter 13 • Functions of(a Several, b): Variables

This is the vertical ... this is the plane y ϭ b and ... curve z ϭ f (x, b). z z ϭ f (x, y) z

y y (a, b, f (a, b))

x b x a The limit f a ϩ h b Ϫ f a b ( , ) ( , ) z ϭ f x b lim h ... the slope of the curve ( , ) h៬0 at (a, b, f (a, b)), which is f (a, b). equals ... x z z

y y

b x b x a ϩ h a a FIGURE 13.49 ∂ f Figure 2.5: geometric interpretation of ∂x provided this limit exists. Notice that the y-coordinate is fixed at y = b in this limit. If we Partialreplace derivatives 1a, b2 by of the function variable with point more 1 thanx, y2, two then variables fx becomes a function of x and y. = 1 2 PartialIn derivatives a similar of way, functions we can with move more thanalong two the variables, surface say z f (xf, yx,,z y), arefrom defined the point in an 1 1 22 ∂ f = a, b, f a, b in such a way that x a is fixed and only y varies. Now, the result is analogous way. For instance, ∂x is the derivative with respect to x regarding all other variables, = 1 2 ∂ f i.e.a tracey and describedz, constant. by Although z f a, ity is, notwhich easy is tothe interpret intersectiongeometrically of the surface since and the the graph plane of = 1 2 ∂x f x(x, y,az) (sitsFigure inside 13.50 a 4-dimensional). The slope of space, this curve the way at toa, computeb is given∂ f ,by∂ f theand ordinary∂ f are exactly derivative the 1 2 ∂x ∂y ∂z of f a, y with respect to y. This derivative is called the partial derivative of f with respect same as two-variable functions. to y, denoted 0f>0y or fy. When evaluated at 1a, b2, it is defined by the limit 1 + 2 - 1 2 f a, b h f a, b fy1a, b2 = lim , hS0 h

provided this limit exists. If we replace 1a, b2 by the variable point 1x, y2, then fy becomes a function of x and y.

f (a, b ϩ h) Ϫ f (a, b) ... the slope of the curve The limit lim h This is the vertical h៬0 z ϭ f (a, y) at (a, b, f (a, b)), plane x ϭ a and ... equals ... z f a b z which is y( , ). z

y (a, b, f (a, b)) y y

x b ϩ h x x b b b a a a ... this is the curve z ϭ f (a, y). FIGURE 13.50 24 Partial Differentiations

2 3 Example 2.2 x +y +xyz ∂ f  Given f (x, y, z) = e . Compute ∂z .

 Solution

∂ f ∂ 2 3 = ex +y +xyz ∂z ∂z 2 3 2 3 ∂(x + y + xyz) = ex +y +xyz · (chain rule) ∂z 2 3 = ex +y +xyz · (0 + 0 + xy) 2 3 = xyex +y +xyz.

2.2.2 Second Derivatives As in single-variable calculus, one can also talk about second derivatives for multivariable functions. Given a two-variable function f (x, y), its first partial derivatives fx and fy are also functions of x and y. Therefore, we can further differentiate them with respect to either x or y.

4 3  Example 2.3 Let f (x, y) = 3x y − 2xy + 5xy . Compute all first and second partial deriva- tives.

 Solution It is easy to see that

∂ f = 12x3y − 2y + 5y3 ∂x ∂ f = 3x4 − 2x + 15xy2 ∂y

Then, the second derivatives are:

∂  ∂ f  ∂ = (12x3y − 2y + 5y3) = 36x2y ∂x ∂x ∂x ∂  ∂ f  ∂ = (12x3y − 2y + 5y3) = 12x3 − 2 + 15y2 ∂y ∂x ∂y ∂  ∂ f  ∂ = (3x4 − 2x + 15xy2) = 12x3 − 2 + 15y2 ∂x ∂y ∂x ∂  ∂ f  ∂ = (3x4 − 2x + 15xy2) = 30xy. ∂y ∂y ∂y

  Notation ∂ ∂ f Since it is a bit clumsy to write ∂y ∂x every time, we can use the following short-hand: ∂2 f ∂  ∂ f  ∂2 f ∂  ∂ f  = = ∂x2 ∂x ∂x ∂x∂y ∂x ∂y ∂2 f ∂  ∂ f  ∂2 f ∂  ∂ f  = = ∂y∂x ∂y ∂x ∂y2 ∂y ∂y

Similarly for the subscript notations, we write fxx = ( fx)x, and fxy = ( fx)y. The later means to differentiate by x first and then by y. Therefore, it is related to the fraction notation by:

∂  ∂ f  ∂2 f f = ( f ) = = . xy x y ∂y ∂x ∂y∂x 2.2 Partial Derivatives 25

∂2 f The above remark seems to suggest that we should be very careful when converting ∂y∂x into the subscript notation fxy. The order of x and y needs to be switched in the conversion. However, thanks to the following important theorem, we don’t need to worry about this too much, since in many cases, we have fxy = fyx. Theorem 2.1 — Mixed Partials Theorem. Consider the function f (x, y), if at least one of the second partials fxy and fyx exists and is continuous, then we must have fxy = fyx.

Proof. Beyond the scope of this course. 

Although the theorem requires that fxy or fyx needs to be continuous, most functions we will encounter in this course are continuous and so this theorem applies. In Example 2.3, you may have already noticed that fxy and fyx are equal. The Mixed Partials Theorem tells us that it is not a coincident!

 Example 2.4 Consider the function: s esin x f (x, y) = √ + cos(xy). x2014 + x2012 + 1

∂  ∂ f  Find the second partial derivative ∂y ∂x .

Solution ∂ f  Needless to say, it is very tedious and time-consuming to compute ∂x . However, ∂  ∂ f  by the Mixed Partials Theorem, we can try to find ∂x ∂y , and if it is continuous, then ∂  ∂ f  ∂  ∂ f  ∂y ∂x = ∂x ∂y . ∂  ∂ f  It is much easier to compute ∂x ∂y since the “monster” term is gone after differentiating the function by y:

∂ f ∂ = 0 − sin(xy) · xy ∂y ∂y = −x sin(xy) ∂  ∂ f  ∂ = − (x sin(xy)) ∂x ∂y ∂x = − sin(xy) − xy cos(xy),

which is a . Therefore, fyx = fxy and so

∂  ∂ f  ∂  ∂ f  = = − sin(xy) − xy cos(xy). ∂y ∂x ∂x ∂y

i The Mixed Partials Theorem also applies to functions of more than two variables and to higher-order derivatives. Let’s say given a function of four variables f (x, y, z, t), we have:

fxyyyzt = fxyztyy = fyyzxty = .... 26 Partial Differentiations 2.3 Chain Rule 2.3.1 Multivariable Chain Rule In single-variable calculus, we apply the chain rule when there is a chain of relations between variables. For example, if f (x) is a function of x, and x is in turn a function of t, then f is d f ultimately a function of t. The derivative dt can be calculated by: d f d f dx = . dt dx dt One may represent this chain of relations by the schematic diagram:

f

d f dx x dx dt t

d f d f dx Figure 2.6: the schematic diagram for chain rule formula dt = dx dt .

For multivariable functions, the relation between variables can be more complicated. For example, let the function u(x, y, z) be the at the point (x, y, z) in the space. Suppose a particle moves along the path

r(t) = x(t)i + y(t)j + z(t)k.

Then, the coordinates x, y and z all depend on t, and so u is ultimately a function of t. See Figure 2.7 for the tree diagram of the variables.

u ∂u ∂u ∂x ∂u ∂z ∂y x y z

dx dy dz dt dt dt

t t t

Figure 2.7: tree diagram of variables for u(x, y, z).

du The derivative dt is the rate of change of the temperature that the particle “feels” as it travels. The multivariable chain rule can be read off from the tree diagram 2.7: du ∂u dx ∂u dy ∂u dz = + + . dt ∂x dt ∂y dt ∂z dt

du Precisely, to write down the chain rule for dt , we find all possible paths from u to t in the tree diagram. Each path consists of some segments. A segment, say, from u to x represents the ∂u derivative ∂x . To write down the chain rule, we “multiply” all segments of each path, and “add” up all the paths.

i In this course, our focus is on how to use the chain rule and how to write it down, but not the proof of the chain rule. The proof is usually taught in advanced real courses.

Let’s see a concrete example of the use of the multivariable chain rule: 2.3 Chain Rule 27

Example 2.5 2 2 2 du  Let u(x, y, z) = x + y − 2z and hx, y, zi = hcos t, sin t, ti. Compute dt .

 Solution Although it can be done by substituting x = cos t, y = sin t and z = t into 2 2 2 du u(x, y, z) = x + y − 2z and then compute dt directly, let’s try to use the chain rule to do it. From the tree diagram (Figure 2.7), we have:

du ∂u dx ∂u dy ∂u dz = + + dt ∂x dt ∂y dt ∂z dt = 2x ·(− sin t) + 2y · cos t +(−2z) · 1 (calculate each derivative) |{z} | {z } |{z} |{z} |{z} |{z} ∂u dx ∂u dy ∂z dz ∂x dt ∂y dt ∂t dt = −2 cos t sin t + 2 sin t cos t − 2t (write x, y and z in terms of t) = −2t.

As illustrated in the above example, once the chain rule formula is correctly written according to the tree diagram, the remaining computations are straight-forward. From now on, we will investigate how to write down the chain rule under various configuration of variables. The computations will usually be skipped and are left as exercises for readers.

Examples with more diversed variable configurations Suppose now that the temperature distribution is changing over time as well, i.e. the tem- perature u(x, y, z, t) is a function of four variables. Again, a particle travels along a path r(t) = x(t)i + y(t)j + z(t)k. Then, the tree diagram of variables in this case can be drawn as in Figure 2.8.

u

∂u ∂u ∂x ∂u ∂t ∂y ∂u ∂z x y z t

dx dy dz dt dt dt

t t t

Figure 2.8: tree diagram of variables for u(x, y, z, t).

du There are four paths from u to t, so we expect the chain rule formula for dt consists of four terms: du ∂u dx ∂u dy ∂u dz ∂u = + + + . dt ∂x dt ∂y dt ∂z dt ∂t

du ∂u i Note that dt is different from ∂t . As the particle travels along its path, the temperature the particle “feels” is a ultimately a function of t, and so the rate of change of temperature du is represented by dt (using d instead of the partial ∂). ∂u On the other hand, the partial ∂t is the of u regarding x, y and z constant! Therefore, it is the rate of change of temperature when the position is fixed! It is not the rate of change of temperature for the moving particle!

Now consider a slightly more complicated example. Let w be a function of x, y and z, and each of x, y and z is a function of s and t, as illustrated in Figure 2.9. The function w is ultimately a function of s and t. 932 Chapter 13 • Functions of Several Variables

932 Chapter 13 • Functions Followingof Several Variables the branches of Figure 13.58 connecting z to t, we have 0z 0z 0x 0z 0y Following =the branches + of Figure 13.58 connecting z to t, we have 0t 0x 0t 0y 0t 0z 0z 0x 0z 0y = + # # 0 t = 20x cos0t 2x0 ycos0t 3y 1 + 1-3 sin 2x sin 3y2 -1 5 e g # 0 h # 0 = 2 cos 2x0 cosz 3y 1 +x1-3 sin 2x sin0 3y2 -1 y 5 z e g 0 0 0x 0t h 0y 0 z x 0z 0y t 0 0t 0 x 0y t = 2 cos 121s + t22 cos 131s - t22 + 3 sin 121s + t22 sin 131s - t22. = 1 1s + t22 1 1s - t22 + 1 1s + t22 1 1s - t22

2 cos 2 d cos 3 d 3 sin 2 sin d 3 . d

28 Partial Differentiations d x d y d x d y

x y x y ➤ ∂w Related Exercises➤ 19–26 There are three paths from w to s, so the chain rule for ∂s is given by the following: Related Exercises 19–26 ∂w ∂w ∂x ∂w ∂y ∂w ∂z = + + . ∂s ∂x ∂s ∂y ∂s ∂z ∂s EXAMPLEEXAMPLE 33 MoreMore variables variables Let wLet be wa functionbe a function of x, y, andof xz, eachy, and of whichz, each is aof which is a functionfunction of of s s andand tt.. w a. Draw a labeled tree diagram showing the relationships among the variables. w a. Draw a labeled tree diagram showing the relationships among the variables. w w 0w w x w w z b. Write the Chain Rule formula for . y 0 0w x w z b. Write the Chain Rule formula fors . xzy y 0 SOLUTION s xzx x y y y z z s t s t s t x x y y z z SOLUTIONa. Because w is a function of x, y, and z, the upper branches of the tree (Figure 13.59) are s t sts t s t st s t labeled with the partial derivatives wx, wy, and wz. Each of x, y, and z is a function of FIGURE 13.59 a. Becausetwo variables, w is a so function the lower of branches x, y, and of the z, treethe alsoupper require branches partial ofderivative the tree labels. (Figure 13.59) are stt s t s b.labeled Extending with Theorem the partial 13.8, derivativeswe take the three wx, pathswy, and through wz. Eachthe tree of that x, connecty, and zw isto as function of QUICK CHECK 3 If Q is a function of FIGURE 13.59 two variables, so the lower branches of the tree also require partial derivative labels. Figurew, x, y 2.9:, and tree z, diagram each of which is a func- (red branches in Figure 13.59). Multiplying the derivatives that appear on each path tion of r, s, and t, how many depen- b. Extendingand adding Theorem gives the result 13.8, we take the three paths through the tree that connect w to s QUICK CHECK 3 If Q is a function of

A “multi-level” example dent variables, intermediate variables, (red branches in Figure 13.59).0w 0 wMultiplying0x 0w 0y the0w derivatives0z that appear on each path ➤ = + + w, x, y, and z, each of which is a func- .

Now suppose w is a function of z only,andz isindependent a function of xvariablesand y, and are both there?x and y areand functions adding gives the result0s 0x 0s 0y 0s 0z 0s of t, as illustrated in Figuretion 2.10.of r, s, and t, how many depen- ➤ Ultimately, w is a function t. There are two paths from w to t in the tree diagram. Each path Related Exercises 19–26

dent variables, intermediatedw variables, 0 0 0 0 0y 0 0

consists of three segments. The chain rule for dt is given by: w w x w w z ➤ = + +

and independent variables are there? It is probably clear by now that we can create a Chain Rule for any set. of relationships

dw dw ∂z dx dw ∂z dy 0s 0x 0s 0y 0s 0z 0s = + . among variables. The key is to draw an accurate tree diagram and label the branches of the ➤ dt dz ∂x dt dz ∂y dt tree with the appropriate derivatives. Related Exercises 19–26

w EXAMPLEIt is probably 4 A clear different by nowkind thatof tree we Let can w createbe a function a Chain of z Rule, where for z is any a function set of relationships dw amongof x and variables. y, and each The of xkey and is y tois adraw function an ofaccurate t. Draw treea labeled diagram tree diagram and label and writethe branches of the > dz treethe with Chain the Rule appropriate formula for derivatives.dw dt. z z z x y SOLUTION The dependent variable w is related to the independent variable t through two paths in the tree: w S z S x S t and w S z S y S t (Figure 13.60). At the top of the wxy EXAMPLE 4 A different kind of tree Let w be a function of z, where z is a function tree, w is a function of the single variable z, so the rate of change is the ordinary deriva- dx dy dw of x and >y, and each of x and y is a function of t. Draw a labeled tree diagram and write dt dt tive dw dz. The tree below z looks> like Figure 13.54. Multiplying the derivatives on each dz theof Chain the two Rule branches formula connecting for dw w todt t,. and adding the results, we have z z ttz SOLUTION The dependentdw dw 0 zvariabledx dw w0z isdy relateddw to0z dxthe independent0z dy variable t through two x FIGUREy 13.60 = + = a + b.

Figure 2.10: tree diagram paths in the tree: wdtS zdzS0x dtS t anddz 0 ywdtS z dzS y0xSdtt (Figure0y dt 13.60). At the top of the xy ➤ tree, w is a function of the single variable z, so the rate ofRelated change Exercises is the 27–30ordinary deriva- dx dy 2.3.2 Implicit Differentiation tive dw>dz. The tree below z looks like Figure 13.54. Multiplying the derivatives on each dt dt Implicit Differentiation Given an implicit equation such as: of the two branches connecting w to t, and adding the results, we have Using the Chain Rule for partial derivatives, the technique of implicit differentiation can be put ttx2 + y3 + sin2 y = 1, in a larger perspective.dw Recalldw that0z ifdx x and ydw are 0relatedz dy throughdw an0 implicitz dx relationship,0z dy such FIGURE 13.60 + p 2 = = > + = a + b

it is very difficult (possibly impossible) to express y in terms of x. In single-variableas calculus, sin xy y x, then dy 0dx is computed 0 using implicit differentiation0 0 (Section . 3.7).

dy dt dz> x dt dz y dt dz 1 x2dt= y dt+ p 2 - we learned how to find using implicit differentiation – regard y as a function of x, and dx Another way to compute dy dx is to define the function F x, y sin xy y x. ➤ dy + p 2 = 1 2 = Related Exercises 27–30 differentiate both sides by x then solve for dx . Notice that the original relationship sin xy y x is F x, y 0. The multivariable chain rule offers an alternative approach to implicit differentiation. Implicit Differentiation Using the Chain Rule for partial derivatives, the technique of implicit differentiation can be put in a larger perspective. Recall that if x and y are related through an implicit relationship, such as sin xy + py 2 = x, then dy>dx is computed using implicit differentiation (Section 3.7). > 1 2 = + p 2 - Another way to compute dy dx is to define the function F x, y sin xy y x. + p 2 = 1 2 = Notice that the original relationship sin xy y x is F x, y 0. 2.3 Chain Rule 29

F ∂F ∂F ∂x ∂y

x y

dy dx x

Figure 2.11: tree diagram for implicit differentiation

Define F(x, y) = x2 + y3 + sin2 y, then the above implicit equation can be written as F(x, y) = 1. Regarding y as a function of x, F(x, y) is ultimately a function of x. Figure 2.11 shows the tree diagram for the variables. Therefore, by the chain rule, we have

dF ∂F ∂F dy = + . dx ∂x ∂y dx

dF Recall that F(x, y) = 1 is a constant, so dx = 0, which yields:

dy dy Fx 0 = Fx + Fy , and so: = − . dx dx Fy

It is a much straight-forward formula than the implicit differentiation method learned in single-variable calculus. When F(x, y) = x2 + y3 + sin2 y, we have:

dy 2x = − . dx 3y2 + 2 sin y cos y 30 Partial Differentiations 2.4 Directional Derivatives ∂ f Recall that the physical meaning of the partial derivative ∂x is the rate of change of f in the ∂ f direction of x. The geometric interpretation can be found in Figure 2.5. Similarly, ∂y is the rate of change of f in the direction of y. In this section, we introduce the way to find the rate of change of f in any other directions, i.e. defined as follows. Definition 2.3 — Directional Derivative. Given a unit direction u = u1i + u2j and a two- variable function f (x, y), the directional derivative of f in the direction of u at point (x, y) is denoted and defined to be:938 Chapter 13 • Functions of Several Variables

d Du f (x, y) = f (x + tu1, y + tu2) . dt 13.6 Directionalt=0 Derivatives and the

Partial derivatives tell us a lot about the rate of change of a function on its domain. How- i When u = i, then u1 = 1 and u2 = 0 and so ever, they do not directly answer some important questions. For example, suppose you d f (x + t y) − f (x y) f 1 1 22 = 1 2 , , are standing∂ at a point a, b, f a, b on the surface z f x, y . The partial derivatives Di f (x, y) = f (x + t, y) = lim = (x, y). → dt t=0 t 0 t fx and∂x fy tell you the rate of change (or slope) of the surface at that point in the directions parallel to the x-axis and y-axis, respectively. But you could walk in an infinite number of directions from that point and find a different rate of change in every direction. With this observation in mind, we pose several questions. We seek the z rate of change • Suppose you are standing on a surface and you walk in a direction other than a coor- f P dinate direction—say, northwest or south-southeast. What is the rate of change of the of at 0 in the direction of u. function in such a direction? • Suppose you are standing on a surface and you release a ball at your feet and let it roll. (a, b, f (a, b)) In which direction will it roll?

z f (x, y) • If you are hiking up a mountain, in what direction should you walk after each step if you want to follow the steepest path? These questions will be answered in this section as we introduce the directional deriva- tive, followed by one of the central concepts of calculus—the gradient. x P (a, b) u 0 y Directional Derivatives Unit vector u 1 1 22 = 1 2 Let a, b, f a, b be a point on the surface z f x, y and let u be a unit vector in the FIGURE 13.63 xy-plane (Figure 13.63). Our aim is to find the rate of change of f in the direction u at Figure 2.12: directional derivative 1 2 1 2 1 2 = 8 9 P0 a, b . In general, this rate of change is neither fx a, b nor fy a, b (unless u 1, 0 ͉u͉ ϭ 1 or u = 80, 19), but it turns out to be a combination of f 1a, b2 and f 1a, b2. u x y In practice, we do not need to compute D f fromϭ the definition, since we have Theorem 2.2 u u u2 sin Figure 13.64a shows the unit vector u at an angle to the positive x-axis; its compo- = 8 9 = 8 u u9 below to help us. In order to introduce this theorem, we first define:nents are u u1, u2 cos , sin . The derivative we seek must be computed along u ϭ cos / Definition 2.4 — Gradient Vector. Given1 a two-variable function fthe(x, yline), the in gradient the xy-plane vector through P0 in the direction of u. A neighboring point P, which is ( ) (a) h units from P along /, has coordinates P1a + h cos u, b + h sin u2 (Figure 13.64b). of f at x, y is denoted and defined as: ᐉ 0 Now imagine the plane Q perpendicular to the xy-plane, containing /. This plane cuts y P(a ϩ h cos , b ϩ h sin ) = 1 2 ∂ f ∂ f the surface z f x, y in a curve C. Consider two points on C corresponding to P0 and P; ∇ f (x, y) = (x, y)i + (x, y)j. 1 2 1 + u + u2 ∂x ∂y they have z-coordinates f a, b and f a h cos , b h sin (Figure 13.65). The slope u h sin of the between these points is 2 3 h As an example, let f (x, y) = x y + x , then 1 + u + u2 - 1 2 f a h cos , b h sin f a, b . P (a, ∂b)f h cos h 0 (x, y) = 2xy + 3x2 ϭ∂͗x ͘ ϭ ͗ ͘ S u u1, u2 cos , sin The derivative of f in the direction of u is obtained by letting h 0; when the limit ex- ∂ f 2 ists, it is called the directional derivative of f at 1a, b2 in the direction of u. It gives the (x, y) = x x ∂y slope of the line tangent to the curve C in the plane Q. (b) Therefore, FIGURE 13.64 ∇ f (x, y) = (2xy + 3x2)i + x2j. DEFINITION Directional Derivative Let f be differentiable at 1a, b2 and let u = 8cos u, sin u9 be a unit vector in the The vector ∇ f (x, y) depends on (x, y). By taking different values of (xxy, y-plane.), a different The directional gradient derivative of f at 1a, b2 in the direction of u is 1 + u + u2 - 1 2 f a h cos , b h sin f a, b 1 2 = Du f a, b lim , hS0 h provided the limit exists.

QUICK CHECK 1 Explain why, when u = 0 in the definition of the directional derivative,

the result is fx1a, b2 and when u = p>2, the result is fy1a, b2. ➤ 2.4 Directional Derivatives 31

vector is produced. For instance, ∇ f (1, 1) = 5i + j, ∇ f (1, 0) = 3i + j.

Theorem 2.2 Given a two-variable function f (x, y), the directional derivative of f at (x, y) in the unit direction u = u1i + u2j is given by:

Du f (x, y) = ∇ f (x, y) · u.

By applying this theorem in the special case u = i, we can see  ∂ f ∂ f  ∂ f ∇ f (x, y) · u = (x, y)i + (x, y)j · i = (x, y) = D f (x, y) ∂x ∂y ∂x i

∂ f as expected. Similarly, we can see ∇ f (x, y) · j = ∂y (x, y) = Dj f (x, y) as expected. Let’s see the proof of the general case:

Proof of Theorem 2.2. The key idea is to use the chain rule. The directional derivative Du f (x0, y0) at the point (x0, y0) is the rate of change of f along the path r(t) = hx0, y0i + thu1, u2i, i.e. x = x0 + u1t and y = y0 + u2t. By definition of directional derivative, Du f (x0, y0) is the d derivative dt f (x0 + u1t, y0 + u2t) at t = 0. Therefore, f is a function of (x, y), and (x, y) are functions of t. By the chain rule, we have: d f D f = u dt ∂ f dx ∂ f dy = + ∂x dt ∂y dt ∂ f d ∂ f d = (x + tu ) + (y + tu ) ∂x dt 0 1 ∂y dt 0 2 ∂ f ∂ f = · u + · u . ∂x 1 ∂y 2 On the other hand,  ∂ f ∂ f  h∇ f , ui = i + j, u i + u j ∂x ∂y 1 2 ∂ f ∂ f = · u + · u . ∂x 1 ∂y 2

Therefore, Du f = h∇ f , ui.  This theorem tells us that the computation of directional derivatives amounts to computing the gradient vector and a dot product. Easy enough? As an example, given f (x, y) = x2y + x3. We worked out in the previous example that ∇ f (1, 1) = 5i + j, and so the directional derivative of f at (1, 1) along u = √1 i + √1 j is given 2 2 by:  1 1   1 1  6 √ √ √ √ √ D √1 i+ √1 j f (1, 1) = ∇ f (1, 1) · i + j = (5i + j) · i + j = . 2 2 2 2 2 2 2

2.4.1 Geometric Interpretation of Gradient Vectors

Theorem 2.2 not only tells us how to compute the directional derivative Du f (x, y), but also tells us in what direction u the derivative Du f (x, y) is the greatest and the smallest. Given a function f (x, y), fix a point (a, b). From the dot product formula 1.1, we know:

Du f (a, b) = ∇ f (a, b) · u = |∇ f (a, b)| |u| cos θ = |∇ f (a, b)| cos θ |{z} =1 32 Partial Differentiations

u

θ ∇ f (a, b)

Figure 2.13: geometric interpretation of Du f

where θ is the angle between ∇ f (a, b) and u. Since cos θ is the largest when θ = 0 at which cos θ = 1, the directional derivative Du f (a, b) is maximized when ∇ f (a, b) is parallel to u. Therefore, ∇ f (a, b) is pointing in the direction at which f increases most rapidly from (a, b). It is quite intuitive that in order to increase the value of f most rapidly, one should go 944 Chapter 13 • Functions of Several Variables along the direction perpendicular to the level curve. It is indeed true. Let’s state it as a theorem: maximum decrease 1 2 = 8 - 9 p> y b. The gradient f 2, 3 12, 12 makes an angle of 7 4 with the positive Theorem 2.3 Let f (x, y) be a two-variable f, D f function,(2, 3) Ϸ 17 and (a, b) be a point on the level curve u x-axis. So, the maximum rate of change of f occurs in this direction, and that rate of f (x, y) = c. Then the gradient vector ∇ f (a, b) iszero orthogonal change to the level curve ͉f(x1, y) =2͉ c=at͉ 8 - 9 ͉ = 1 Ϸ change is f 2, 3 12, 12 12 2 17. The direction of maximum the point (a, b). d, D f (2, 3) 0 u decrease is opposite to the direction of the gradient, which corresponds to u = 3p>4. The maximum rate of decrease is the negative of the maximum rate of increase, or Proof. Let r(t) = x(t)i + y(t)j be a parametrization of the level curve f (x, y) = c. In other -1212 Ϸ -17. The function has zero change in the directions orthogonal to the words, we have f (x(t), y(t)) = c forlevel any t, and so curve (2, 3) gradient, which correspond to u = p>4 and u = 5p>4. d maximum Figure 13.70 summarizes these conclusions. Notice that the gradient at 12, 32 f (x(t) y(t)) = , increase0. appears to be orthogonal to the level curve of f passing through 12, 32. We next see dt j, D f (2, 3) Ϸ 17

u that this is always the case. By the chain rule, we get: ➤ x h, D f (2, 3) 0 Related Exercises 33–42 u ∂ f dx ∂ f dy zero change + = 0 FIGURE 13.70 ∂x dt ∂y dt The Gradient and Level Curves     1 2 ∂ f ∂ f dx dy Theorem 13.11 states that in any direction orthogonal to the gradient f a, b , the func- i + j · i + j = 0 1 2 1 2 = ∂x ∂y dt dt tion f does not change at a, b . Recall from Section 13.2 that the curve f x, y z0, ∇ f · r0(t) = 0. where z0 is a constant, is a level curve, on which function values are constant. Combining 1 2 these two observations, we conclude that the gradient f a, b is orthogonal to the line 0 Therefore, the gradient vector ∇ f is orthogonal to r (t) which is thetangent tangent to vector the level of thecurve level through 1a, b2. curve r(t). It completes the proof. 

THEOREM 13.12 The Gradient and Level Curves y Given a function f differentiable at 1a, b2, the line tangent to the level curve of f f (a, b) 1 2 1 2 1 2 ϶ t at a, b is orthogonal to the gradient f a, b , provided f a, b 0.

(a, b) (c, d) = 1 2 Proof: A level curve of the function z f x, y is a curve in the xy-plane of the form 1 2 = f x y z t Level curve: ( , ) 0 f x, y z0, where z0 is a constant. By Theorem 13.9, the slope of the line tangent to the f (c, d) 1 2 = - > level curve is y x fx fy. It follows that any vector that points in the direction of the tangent line at the point x 8 9 1a, b2 is a scalar multiple of the vector t = -fy1a, b2, fx1a, b2 (Figure 13.71). At that 1 2 = 8 1 2 1 29 FIGURE 13.71 same point, the gradient points in the direction f a, b fx a, b , fy a, b . The dot ∇ ( ) ( ) 1 2 Figure 2.14: f a, b is orthogonal to the level curve ofproductf at ofa, bt and f a, b is

➤ # 1 2 = 8- 9 # 8 9 = 1- + 2 =

We have used the fact that the vector t f a, b fy, fx 1a,b2 fx, fy 1a,b2 fx fy fx fy 1a,b2 0, 8 9 > ➤ 2.4.2 Directional Derivative of Three-Variablea, b has slope Functions b a. 1 2 which implies that t and f a, b are orthogonal. The directional derivative and the gradient vector for functions f (x, y, z) of three variables is An immediate consequence of Theorem 13.12 is an alternative equation of the tangent defined in an analogous way as for two-variable functions. Precisely, we have: 1 2 = line. The curve described by f x, y z0 can be viewed as a level curve in the xy-plane for a 1 2 1 2 ∂ f ∂ f ∂ f surface. By Theorem 13.12, the line tangent to the curve at a, b is orthogonal to f a, b . ∇ f = i + j + k 1 2 1 2 # 8 - - 9 = ∂x ∂y ∂z Therefore, if x, y is a point on the tangent line, then f a, b x a, y b 0, 1 2 = which, when simplified, gives an equation of the line tangent to the curve f x, y z0:

fx1a, b21x - a2 + fy1a, b21y - b2 = 0.

QUICK CHECK 4 Draw a circle in the xy-plane centered at the origin and regard it is as a level curve of the surface z = x 2 + y 2. At the point 1a, a2 of the level curve in the

xy-plane, the slope of the tangent line is -1. Show that the gradient at 1a, a2 is orthogonal to the tangent line. ➤

EXAMPLE 6 and level curves Consider the upper sheet = 1 2 = 2 + 2 + 2 z f x, y 1 2x y of a hyperboloid of two sheets. a. Verify that the gradient at 11, 12 is orthogonal to the corresponding level curve at that point. b. Find an equation of the line tangent to the level curve at 11, 12. 13.6 Directional Derivatives and the Gradient 945

SOLUTION

➤ The fact that y = -2x>y may also a. You can verify that 11, 1, 22 is on the surface; therefore, 11, 12 is on the level curve be obtained using Theorem 13.9: If corresponding to z = 2. Setting z = 2 in the equation of the surface and squaring 1 2 = 1 2 = - > 2 2 2 2 F x, y 0, then y x Fx Fy. both sides, the equation of the level curve is 4 = 1 + 2x + y , or 2x + y = 3, which is the equation of an (Figure 13.72). Differentiating 2x 2 + y 2 = 3 with z respect to x gives 4x + 2yy1x2 = 0, which implies that the slope of the level curve is z ͙1 2x2 y2 1 2 = - 2x 1 2 - y x . Therefore, at the point 1, 1 , the slope of the tangent line is 2. Any y z 2 (1, 1, 2) vector proportional to t = 81, -29 has slope -2 and points in the direction of the tangent line. Level curve We now compute the gradient: for z 2 2.4 Directional Derivatives 33 2x y Tangent to 1 1 2 = 8 9 = h i f x, y fx, fy , . level curve 2 2 2 2 1 y 21 + 2x + y 21 + 2x + y at (1, 1) 1.0 ͗ ͘ 0.5 1 2 = 8 1 9 t 1, 2 It follows that f 1, 1 1, (Figure 13.72). The tangent vector t and the gradient f (1, 1) ͗1, q͘ 2 0.8 are orthogonal because x 0.7 0.8 # 1 2 = 8 - 9 # 8 1 9 = t f 0 t f 1, 1 1, 2 1, 2 0. f is orthogonal to level curves. b. An equation of the line tangent to the level curve at 11, 12 is 0.4 0.6 0.6 FIGURE 13.72 fx11, 121x - 12 + fy11, 121y - 12 = 0,

d d 1 1 0.2 Ball is released 2 0.4 z at (3, 4, 61). = - +

2 2 or y 2x 3. z 4 x0.1 3y ➤ 0.3 Related Exercises 43–50

0.2 EXAMPLE 7 Path of steepest descent Consider the paraboloid = 1 2 = + 2 + 2 1 2 Path of z f x, y 4 x 3y (Figure 13.73). Beginning at the point 3, 4, 61 on the 0.0 steepest surface, find the path in the xy-plane that points in the direction of steepest descent on the descent on surface surface. 0.0 0.2 0.4 0.6 0.8 1.0 Level SOLUTION Imagine releasing a ball at 13, 4, 612 and assume that it rolls in the direc- curves Figure 2.15: a plot of the field ∇ (sin xy) and the level curvestion of of steepest sin xy. descent at all points. The projection of this path in the xy-plane points - 1 2 = 8- - 9 1 2 y in the direction of f x, y 2x, 6y , which means that at the point x, y the line tangent to the path has slope y1x2 = 1-6y2>1-2x2 = 3y>x. Therefore, the path in Given a unit vector u, the directional derivativex Projection of f (x, yof, zpath) in of the direction of u is given by: steepest descent on the xy-plane satisfies y1x2 = 3y>x and passes through the initial point 13, 42. You can 4x3 verify that the solution to this is y = 4x 3>27 and the projection of Du f = ∇xy-plane,f · u. y , 27 the path of steepest descent in the xy-plane is the curve y = 4x 3>27. The descent ends at is orthogonal to 1 2 Since the level set of a three-variable functionlevel is typicallycurves. a surface,0, the0 , gradientwhich corresponds vector ∇ fto the vertex of the paraboloid (Figure 13.73). At all points of

at any given point is orthogonalFIGURE to the 13.73 level surface f (x, y, z) = c at thatthe descent, point. See the Figurecurve in 2.16. the xy-plane is orthogonal to the level curves of the surface. Related Exercises 51–54 ➤

Gradient vector at (a, b, c) is = 3> 1 2 = >

orthogonal to level surface. QUICK CHECK 5 Verify that y 4x 27 satisfies the equation y x 3y x, with 1 2 = ➤ z y 3 4. The Gradient in Three Dimensions The directional derivative, the gradient, and the idea of a level curve extend immediately = 1 2 to functions of three variables of the form w f x, y, z . The main differences are that the gradient is a vector in ޒ3 and level curves become level surfaces (Section 13.2). Here is f (a, b, c) how the gradient looks when we step up one dimension. = 1 2 The easiest way to visualize the surface w f x, y, z is to picture its level surfaces— x 3 y the surfaces in ޒ on which f has a constant value. The level surfaces are given by the 1 2 = FIGURE 13.74 equation f x, y, z C, where C is a constant (Figure 13.74). The level surfaces can be Figure 2.16: ∇ f (a, b, c) is orthogonal to the level surfacegraphed, of f at ( aand, b, cthey) may be viewed as layers of the full four-dimensional surface (like lay- ers of an onion). With this image in mind, we now extend the concept of a gradient. In the next section, we will use this fact to find the equation of the tangent plane to a surface. 34 Partial Differentiations 2.5 Tangent Planes

In single-variable calculus, the tangent line to the curve y = f (x) at a point (x0, f (x0)) has 0 slope equal to f (x0). Using the slope, one can easily write down the equation of the tangent line as: 0 y = f (x0) + f (x0)(x − x0). In , the graph of a two-variable function z = f (x, y) is no longer a curve but a surface. Therefore, there are infinitely many tangent lines passing through any given point on a surface. However, there is only one tangent plane at any given point. In this section, we want to find an equation for the tangent plane. Recall that the two ingredients of finding an equation of a plane are: • a normal vector to the plane; and • any given point on the plane. Naturally, the point can be taken to be the contact point of the surface (x0, y0, z0). To find the normal vector, we will use the gradient vector. In the previous section, we see that given a three-variable function g(x, y, z), the gradient vector ∇g(x, y, z) is perpendicular to the level surface g(x, y, z) = c. In other words, ∇g is a normal vector of the level surface g(x, y, z) = c.

 Example 2.6 Find the tangent plane to the surface

x2 + y2 = z2 + 3

at the point (x, y, z) = (2, 0, −1).

 Solution First we need to write the equation of the surface in a level set form, i.e.

x2 + y2 − z2 = 3

Define g(x, y, z) = x2 + y2 − z2, then the given surface is a level set g(x, y, z) = 3. By direct computations,

∇g(x, y, z) = 2xi + 2yj + (−2z)k ∇g(2, 0, −1) = 4i + 2k.

Then, n := 4i + 2k is a normal vector of the surface at (2, 0, −1). The equation of the tangent plane at (2, 0, −1) is given by:

4x + 0y + 2z = 4(2) + 0(0) + 2(−1), 4x + 2z = 6.

Given a two-variable function f (x, y), the graph z = f (x, y) is a surface. In order to find the tangent plane at a given point, one can rewrite the graph equation z = f (x, y) as:

z − f (x, y) = 0.

Then, one can define g(x, y, z) = z − f (x, y) so that the graph of the two-variable function f (x, y) becomes a level set of a three-variable function g(x, y, z). Let’s look at an example: 2.5 Tangent Planes 35

x  Example 2.7 Given the function f (x, y) = x cos y − ye , find the tangent plane at (0, 0, 0) to the graph z = x cos y − yex.

 Solution First rearrange the terms so that the graph equation becomes a level set:

z − x cos y + yex = 0.

Define g(x, y, z) = z − x cos y + yex, then the surface under consideration is the level set g(x, y, z) = 0.

∇g(x, y, z) = (− cos y + yex)i + (x sin y + ex)j + k. The normal vector at (0, 0, 0) is given by:

n = ∇g(0, 0, 0) = −i + j + k.

The equation of the tangent plane at (0, 0, 0) to the graph is:

−x + y + z = 0.

Generally, one can derive a formula for finding the tangent plane of any graph of a two-variable function f (x, y): Theorem 2.4 The equation of the tangent plane for the graph z = f (x, y) at the point (x0, y0, f (x0, y0)) is given by: ∂ f ∂ f z = f (x , y ) + (x , y ) · (x − x ) + (x , y ) · (y − y ). 0 0 ∂x 0 0 0 ∂y 0 0 0

Proof. Write the graph equation z = f (x, y) as:

z − f (x, y) = 0 and define g(x, y, z) = z − f (x, y), then the graph can be regarded as a level set g(x, y, z) = 0 of the three-variable function g.

∂ f ∂ f ∇g(x, y, z) = − i − j + k. ∂x ∂y

At the point (x0, y0, f (x0, y0)), the normal vector to the surface is therefore given by: ∂ f ∂ f n = ∇g(x , y , f (x , y )) = − (x , y )i − (x , y )j + k. 0 0 0 0 ∂x 0 0 ∂y 0 0

The equation of the tangent plane at (x0, y0, f (x0, y0)) is:

 ∂ f   ∂ f   ∂ f   ∂ f  − (x , y ) x + − (x , y ) y + z = − (x , y ) x + − (x , y ) y + f (x , y ) ∂x 0 0 ∂y 0 0 ∂x 0 0 0 ∂y 0 0 0 0 0

By rearrangement, we get:

∂ f ∂ f z = f (x , y ) + (x , y ) · (x − x ) + (x , y ) · (y − y ), 0 0 ∂x 0 0 0 ∂y 0 0 0 as desired.  36 Partial Differentiations 2.6 Local Extrema In single-variable calculus, the role that tangent lines play in optimization problems is that having a horizontal tangent line at a point indicates the point is critical. This critical point is a candidate for local maximum or minimum. The second derivative may be used to determine whether the critical point is a local maximum or a local minimum. In this section, we will extend the concept of critical points to two-variable functions, and introduce the second for these functions.

2.6.1 Critical Points ∂ f ∂ f We focus on the case where both ∂x and ∂y exist so that we don’t need to worry whether the function is differentiable or not. Recall that the equation of the tangent plane to the graph z = f (x, y) at a point (a, b) is given by:

z = f (a, b) + fx(a, b)(x − a) + fy(a, b)(y − b).

This plane is horizontal if and only if fx(a, b) = fy(a, b) = 0. This motivates the definition of critical points for two-variable functions: Definition 2.5 — Critical Points. ∂ f ∂ f Given a function f (x, y) whose partial derivatives ∂x and ∂y exist everywhere. A point (a, b) is said to be a critical point if the tangent plane at (a, b) to the graph z = f (x, y) is horizontal. ∂ f ∂ f Therefore, (a, b) is a critical point ⇔ ∂x (a, b) = ∂y (a, b) = 0 ⇔ ∇ f (a, b) = 0.

i As in single-variable calculus, the critical points are just candidates of maximum/minimum.

2 2  Example 2.8 Find all critical point(s) of the function f (x, y) = xy − x − y − 2x − 2y + 4.

Solution ∂ f ∂ f  We compute ∂x and ∂y , and then set them to zero and solve for (x, y):

∂ f 0 = = y − 2x − 2 ∂x ∂ f 0 = = x − 2y − 2. ∂y

It is a system of equations with unknowns x and y. The first equation gives y = 2x + 2, and substitute it into the second equation, we get:

x − 2(2x + 2) − 2 = 0 ⇒ x = −2.

When x = −2, we have y = −2, and so (x, y) = (−2, −2) is a critical point of f (x, y). See Figure 2.17a for its graph. 2.6 Local Extrema 37

 Example 2.9 Find all critical point(s) of the function f (x, y) = sin x sin y

 Solution Consider: ∂ f ∂ f = 0 and = 0 ∂x ∂y cos x sin y = 0 and sin x cos y = 0 (cos x = 0 or sin y = 0) and (sin x = 0 or cos y = 0) π π (x = + kπ or y = mπ) and (x = nπ or y = + pπ). 2 2 Here m, n, k, p are any integers. Some logical deductions show that these imply the following: π π x = + kπ and y = + pπ 2 2 or: y = mπ and x = nπ.

Therefore, there are infinitely many critical points:  π π  + kπ, + pπ , (mπ, nπ) 2 2 where m, n, k, p are any integers. See Figure 2.17b for its graph.

(a) graph of z = xy − x2 − y2 − 2x − 2y + 4 (b) graph of z = sin x sin y

Figure 2.17: graphs of functions in Examples 2.8 and 2.9

2.6.2 Second Derivative Test of Multivariable Functions

In single-variable calculus, to determine the nature of a critical point x0 of a function f (x), we 00 look at its second derivative at x0. If f (x0) > 0, the graph y = f (x) is concave up around x0 00 and so (x0, f (x0)) is a local minimum point. On the other hand, if f (x0) < 0, the graph is concave down near x0 and so (x0, f (x0)) is a local maximum point. In multivariable calculus, however, to determine whether a critical point (x0, y0) of a two- variable function f (x, y) is not as simple as in single-variable calculus. Take the following function as an example: f (x, y) = x2 + 4xy + y2. One can easily verify that ∇ f (0, 0) = 0 and so (0, 0) is a critical point. For the second derivatives, we find that:

fxx(0, 0) = 2 fyy(0, 0) = 2 38 Partial Differentiations for every (x, y) on the R2 plane. Both are positive numbers. You may be tempted to conclude that (0, 0) is a local maximum point. However, if one plots the graph of this function (see Figure 2.18), one can see easily that (0, 0) is neither a local maximum or a local minimum.

Figure 2.18: (0, 0) is neither a maximum or minimum

Around (0, 0), the graph is a concave up in some directions but concave down in other directions. We call this (0, 0) a saddle. This example shows the signs of fxx and fyy alone could not conclude the nature of the critical point. In fact, the second derivative test for two-variable functions is slightly more complicated than that in single-variable calculus: Theorem 2.5 — Second Derivative Test for Two-Variable Functions. Let f (x, y) be a twice dif- ferentiable function and (x0, y0) is a critical point of f , i.e. ∇ f (x0, y0) = 0. Then the nature of this critical point (x0, y0) is determined by the following table:   2 fxx fyy − fxy fxx(x0, y0) (x0, y0) is a: (x0,y0) > 0 > 0 local minimum > 0 < 0 local maximum < 0 anything saddle Any other cases are inconclusive.

For the function f (x, y) = x2 + 4xy + y2 in the above example, to determine the nature of (0, 0) we also need fxy(0, 0), which can be found as equal to 4. Therefore, we have:   2 2 fxx fyy − fxy = 2 × 2 − 4 < 0, (0,0)

fxx(0, 0) = 2 > 0.

From the table in Theorem 2.5, we conclude (0, 0) is a saddle, as expected from the plot of the its graph. Let’s look at one more example before we learn the proof of the Second Derivative Test.

2 3 2  Example 2.10 Let f (x, y) = 3y − 2y − 3x + 6xy. Find all critical points and determine the nature of each of them.

 Solution To find all critical points, we set:

∂ f = −6x + 6y = 0, ∂x ∂ f = 6y − 6y2 + 6x = 0. ∂y

From the first equation, we get y = x. Substitute this into the second equation, we yield:

6x − 6x2 + 6x = 0, or equivalently 2x − x2 = 0. 2.6 Local Extrema 39

By factorization, we get x(2 − x) = 0. Therefore

x = 0 or x = 2.

By noting that y = x, we have two critical points: (0, 0) and (2, 2). Next we compute the second derivatives of f :

fxx = −6 fxy = 6

fyx = 6 fyy = 6 − 12y

 2  Critical point P fxx(P) fyy(P) fxy(P) fxx fyy − fxy (P) Nature of P (0, 0) -6 6 6 -72 saddle (2, 2) -6 -18 6 72 local maximum

Explanation of the Second Derivative Test In single-variable, the second derivative test can be explained using convexity of the graph y = f (x). However, this approach can hardly be generalized to higher dimensions. Before we explain why the above second derivative test works for two-variable functions f (x, y), we first seek an alternative explanation of the single-variable second derivative test using Taylor’s . Recall that the Taylor’s series of a given function f (x) about x = a is given by:

f 00(a) f 000(a) f (x) = f (a) + f 0(a)(x − a) + (x − a)2 + (x − a)3 + ... 2! 3!

If f (x) has a critical point at x = a, then f 0(a) = 0. Also, when x is very close to a, the higher-order terms (x − a)3, (x − a)4, etc. are significantly smaller than the quadratic term (x − a)2. Therefore, the function f (x) is approximately given by:

f 00(a) f (x) ' f (a) + (x − a)2 when x is near a. 2!

00 f (a) 2 00 The right-hand side f (a) + 2! (x − a) is a quadratic function. If f (a) > 0, then the graph 00 00 f (a) 2 f (a) 2 y = f (a) + 2! (x − a) is a concave up and so f (a) + 2! (x − a) ≥ f (a). Therefore, 00 f (a) 2 f (x), which is approximately f (a) + 2! (x − a) , is also ≥ f (a) when x is near a. This explains f (x) has a local minimum at x = a. 00 00 f (a) 2 On the other hand, if f (a) < 0, then the graph y = f (a) + 2! (x − a) is a concave down parabola. Similar argument as above shows f (x) has a local maximum at x = a.

10

8

6

4

-1.0 -0.5 0.0 0.5 1.0

Figure 2.19: blue graph shows y = f (x) where f 0(0) = 0; yellow graph shows y = f (0) + 00 f (0) 2 00 2! x where f (0) > 0

Back to multivariable calculus, we now explain the second derivative test using the Taylor’s series approach. Given a function f (x, y), the multivariable Taylor’s series about (x, y) = 40 Partial Differentiations

(x0, y0) is given by:

f (x, y) = f (x0, y0) + fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0)

f (x , y ) 2 fxy(x0, y0) fyy(x0, y0) + xx 0 0 (x − x )2 + (x − x )(y − y ) + (y − y )2 2! 0 2! 0 0 2! 0 + higher-order terms

The proof is beyond the scope of the course. If (x0, y0) is a critical point of f (x, y), then fx(x0, y0) = fy(x0, y0) = 0. For simplicity, denote P = (x0, y0), then when (x, y) is near P, we have: 1   f (x, y) ' f (x , y ) + f (P)(x − x )2 + 2 f (P)(x − x )(y − y ) + f (P)(y − y )2 . 0 0 2 xx 0 xy 0 0 yy 0

Therefore, to determine whether (x0, y0) is a local maximum/minimum or a saddle of f (x, y), one should determine whether the quadratic

2 2 fxx(P)(x − x0) + 2 fxy(P)(x − x0)(y − y0) + fyy(P)(y − y0) is positive/negative or neither. For simplicity, denote

A = fxx(P), B = fxy(P), C = fyy(P),

X = x − x0, Y = y − y0. Then, the quadratic expression can be simplified as:

2 2 2 2 fxx(P)(x − x0) + 2 fxy(P)(x − x0)(y − y0) + fyy(P)(y − y0) = AX + 2BXY + CY .

To determine whether AX2 + 2BXY + CY2 is always positive/negative or neither, one looks the ∆ = (2B)2 − 4AC = 4(B2 − AC): AC − B2 A AX2 + 2BXY + CY2 near P, f (x, y) is > 0 > 0 ≥ 0 ≥ f (P) > 0 < 0 ≤ 0 ≤ f (P) < 0 anything can be either + or − can be either ≥ f (P) or ≤ f (P)

Translate back to previous notations, we can conclude:   2 fxx fyy − fxy fxx(x0, y0) (x0, y0) is a: (x0,y0) > 0 > 0 local minimum > 0 < 0 local maximum < 0 anything saddle This explains the second derivative test for two-variable functions! 2.7 Lagrange’s Multiplier 41 2.7 Lagrange’s Multiplier In the previous section, we learned how to find critical points in the interior of a domain, namely by solving ∇ f = 0. These critical points are candidates of the maximum or minimum of the function. We also learn how to determine the local nature of the critical points. However, to determine the maximum/minimum on the boundary of a domain, the gradient method does not as the tangent plane at the maximum/minimum needs not be horizontal. When the domain of a function f (x, y) is restricted on a level set such as x2 + y2 = 1, which is a unit circle, we use a method called the Lagrange’s Multiplier. We first state the method, then look at a few examples, and finally explain why it works. Given a function f (x, y) which we want to maximize or minimize, and the variables (x, y) are restricted by the constraint g(x, y) = c. Then, to determine all possible candidates of maximum/minimum point on the constraint, we: 1. Solve the system of equations

∇ f (x, y) = λ∇g(x, y) g(x, y) = c

Here the unknowns are x, y and λ. 2. The solutions of (x, y) are the possible candidates of the maximum or minimum points of the function f (x, y). We call these boundary critical points. 3. Finally, evaluate f (x, y) at each boundary critical point found. The point giving the largest value of f (x, y) is the maximum point on the boundary, and that giving the smallest value of f (x, y) is the minimum.

i We call this the Lagrange’s Multiplier method because the scalar λ is called the Lagrange’s Multiplier.

2 2  Example 2.11 Let f (x, y) = x + y + 2x + 2y, find the maximum and minimum values of f when (x, y) is restricted on the constraint x2 + y2 = 1.

2 2  Solution f (x, y) is the function we want to maximize and minimize. Let g(x, y) = x + y so that the level set g(x, y) = 1 is our constraint. First we compute:

∇ f = h2x + 2, 2y + 2i ∇g = h2x, 2yi.

Therefore, the vector equation ∇ f (x, y) = λ∇g(x, y) is equivalent to the two equations 2x + 2 = 2λx and 2y + 2 = 2λy. Combining with the constraint equation x2 + y2 = 1, we get a system of three equations with three unknowns x, y and λ:

2x + 2 = 2λx 1

2y + 2 = 2λy 2 2 2 x + y = 1 3

Since we are interested in solving for (x, y) and finding λ is optional, we divide 1 by 2 so that the λ can be canceled. However, we may worry that whether 2 is zero! Therefore, we split into two cases. Case 1: 2y + 2 6= 0 (and so 2λy 6= 0 too) 1 ÷ 2 gives: 2x + 2 2λx = . 2y + 2 2λy 42 Partial Differentiations

After cancellations, we get: x + 1 x = . y + 1 y By cross multiplication:

y(x + 1) = x(y + 1) ⇒ xy + y = xy + x ⇒ y = x.

2 1 1 Substitute y = x into 3 , we have 2x = 1, and so x = √ or − √ . Since y = x, the solutions 2 2 for (x, y) in this case are:

 1 1   1 1  (x, y) = √ , √ , − √ , − √ . 2 2 2 2 Case 2: 2y + 2 = 0 In this case, y = −1. Substitute this into 3 , we get x = 0. However, putting x = 0 into 1 yields 2 = 0 which is absurd! Therefore, there is no solution in this case.     To sum up, the boundary critical points are: √1 , √1 , − √1 , − √1 . Evaluate f (x, y) 2 2 2 2 at each point gives:

 1 1  √ f √ , √ = 1 + 2 2 2 2  1 1  √ f − √ , − √ = 1 − 2 2 2 2   Therefore, subject to the constraint x2 + y2 = 1, √1 , √1 is the maximum point of f with 2 2 √   √ value 1 + 2 2, and − √1 , − √1 is the minimum point of f with value 1 − 2 2. See Figure 2 2 2.20 (the blue circle is the constraint).

1.0 3.82843

1

0.5 2.41421

0.414214

3.12132 0.0

0.292893

-0.5 -1.82843

1.70711

-1.0 -1.12132 -1.0 -0.5 0.0 0.5 1.0 (a) the graph (b) the level set diagram

Figure 2.20: f (x, y) = x2 + y2 + 2x + 2y in Example 2.11 2.7 Lagrange’s Multiplier 43

2 2  Example 2.12 Let f (x, y) = x − 4x + y + 9. Find the maximum and minimum points and values of f (x, y) subject to the constraint 4x2 + 9y2 = 36.

2 2  Solution Define g(x, y) = 4x + 9y , then the constraint is the level set g(x, y) = 36. Set-up the Lagrange’s Multiplier system:

∇ f (x, y) = λ∇g(x, y) g(x, y) = 36

By computing ∇ f and ∇g, the above is equivalent to a system of three equations:

2x − 4 = 8λx 1

2y = 18λy 2 2 2 4x + 9y = 36 3

Case 1: 2 6= 0 By 1 ÷ 2 , we get:

2x − 4 8λx = 2y 18λy x − 2 4x = (cancel λ) y 9y 9y(x − 2) = 4xy (cross multiplication) 9(x − 2) = 4x (cancel y 6= 0) 18 x = 5

18 However, substitute x = 5 into 3 , we get:

 18 2 9y2 = 36 − 4 5

which is a negative number, but 9y2 must be positive (or zero)! Therefore, there is no solution in this case. Case 2: 2 = 0 2 In this case, we have 2y = 18λy = 0, so y = 0. From 3 , we have 4x = 36 and so x = 3 or x = −3. Therefore, (x, y) = (3, 0) and (x, y) = (−3, 0) are the solutions in this case. Finally, we evaluate f at each boundary critical point:

f (3, 0) = 6 f (−3, 0) = 30

Therefore, minimum point is (3, 0) with value 6; maximum point is (−3, 0) with value 30. See Figure 2.21 972 Chapter 13 • Functions of Several Variables 13.9 Lagrange Multipliers

44 PartialOne of Differentiations many challenges in and marketing is predicting the behavior of consumers. Basic models of consumer behavior often involve a utility function that expresses consumers’ combined preference for several different amenities. For example, a simple utility function might have the form U = f 1/, g2, where / represents the amount of leisure time and g represents the number of consumable goods. The model assumes that consumers try to maximize their utility function, but they do so under certain constraints Find the maximum and minimum on the variables of the problem. For example, increasing leisure time may increase utility, values of z as (x, y) varies over C. but leisure time produces no income for consumable goods. Similarly, consumable goods z may also increase utility, but they require income, which reduces leisure time. We first develop a general method for solving such constrained optimization problems and then return to economics problems later in the section.

The Basic Idea (a) the graph 2 30 26 18 10 We start with a typical constrained optimization problem with two independent variables z f (x, y) and give its method of solution; a generalization to more variables then follows. We seek maximum and/or minimum values of a f (the objective function) 1 y with the restriction that x and y must lie on a constraint curve C in the xy-plane given by C 1 2 = Constraint g(x, y) 0 g x, y 0 (Figure 13.94). x 0 curve 6 The problem and a method of solution are easy to visualize if we return to Example 6 1 2 = FIGURE 13.94 of Section 13.8. Part of that problem was to find the maximum value of f x, y x 2 + y 2 - 2x + 2y + 5 on the circle C: 51x, y2: x 2 + y 2 = 46 (Figure 13.95a). In -1 Figure 13.95b we see the level curves of f and the point P1- 12, 122 on C at which f has a maximum value. Imagine moving along C toward P; as we approach P, the values -2 30 22 14 of f increase and reach a maximum value at P. Moving past P, the values of f decrease. -3 -2 -1 0 1 2 3 (b) the level set diagram z f (x, y) x2 y2 2x 2y 5 y Level curves of f Figure 2.21: f (x, y) = x2 − 4x + y2 + 9 in Example 2.12

Maximum value of f on 2 The equation ∇ f = λ∇g concerns about the geometry of the level sets and the constraint C occurs at P(͙2, ͙2). at the boundary critical point. Given a function f (x, y) subject to the constraint g(x, y) = c. At the point (a, b) on the constraint where the maximum or minimum of f (x, y) is achieved, the 2 1 4 x level set of f (x, y) at (a, b) is tangent to the constraint g(x, y) = c. Consequently, the gradient vector ∇ f (a, b), which is perpendicular to the level set of f at (a, b), must be parallel to the gradient vector ∇g(a, b), which is perpendicular to the constraint g = c. See Figure 2.22 for an 2 illustration. Therefore, at such a point, we must have: Constraint curve C: {(x, y): x2 y2 4} ∇ f (a, b) = λ∇g(a, b) 4 x y C: {(x, y): x2 y2 4} where λ is a scalar. (a) (b) FIGURE 13.95 f (a, b) y Level curves of f g(a, b) Figure 13.96 shows what is special about the point P. We already know that at any point P1a, b2, the line tangent to the level curve of f at P is orthogonal to the gradient P(a, b) 1 2 f a, b (Theorem 13.12). We also see that the line tangent to the level curve at P is tan- gent to the constraint curve C at P. We prove this fact shortly. Constraint x Furthermore, if we think of the constraint curve C as just one level curve of the func- Tangent curve tion z = g1x, y2, then it follows that the gradient g1a, b2 is also orthogonal to C at to C C: g(x, y) 0 1a, b2, where we assume that g1a, b2 ϶ 0 (Theorem 13.12). Therefore, the gradients at (a, b) f (a, b) is parallel 1 2 1 2 f a, b and g a, b are parallel. These properties characterize the point P at which to g(a, b) at P(a, b). f has an extreme value on the constraint curve. They are the basis for the method of FIGURE 13.96 Lagrange multipliers that we now formalize. Figure 2.22: ∇ f and ∇g are parallel at the boundary critical point (a, b) 2.7 Lagrange’s Multiplier 45

The Lagrange’s Multiplier method also works for three-variable functions, yet the system of equations may be more complicated. Let’s look at the example:

 Example 2.13 Find the distance from (0, 0, 0) to the plane 2x + 3y + 4z = 29 using La- grange’s Multiplier.

 Solution The distance from a point P to a plane is defined to be the shortest possible distance between the given point P and any point Q on the plane. Let’s first formulate this problem in a mathematical way. We want to: q minimize x2 + y2 + z2 subject to constraint 2x + 3y + 4z = 29

p p  However, to minimize x2 + y2 + z2 amounts to calculating ∇ x2 + y2 + z2 . As you p can imagine, it would be messy. It is useful to observe that minimizing x2 + y2 + z2 is equivalent to minimizing x2 + y2 + z2, i.e. the square of the distance from the origin. The latter is much easier to handle. Let:

f (x, y, z) = x2 + y2 + z2 g(x, y, z) = 2x + 3y + 4z,

then the constraint is the level set g = 29. Set-up the Lagrange’s Multiplier system ∇ f = λ∇g as in previous examples:

2x = 2λ 2y = 3λ 2z = 4λ 2x + 3y + 4z = 29

3λ Then x = λ, y = 2 and z = 2λ. Substitute them into the constraint equation, we get: 9λ 2λ + + 8λ = 29. 2 It is easy to see that λ = 2, and therefore (x, y, z) = (2, 3, 4). It gives the unique critical point. It is intuitive that the minimum point must exist in this problem (and there is no maximum point), so this unique critical point must√ give the minimum point. Since f (2, 3, 4) = 29, the distance from (0, 0, 0) to the plane is 29. 46 Partial Differentiations 2.8 Optimizations In this section we see more optimization examples of using the gradient and/or Lagrange’s Multiplier method.

 Example 2.14 Many airlines require that the sum of length, width and height of a checked baggage cannot exceed 62 inches. Find the dimensions of the rectangular baggage that has the greatest possible volume under this regulation.

 Solution Denote l, w, h to be the length, width and height respectively. We need to maximize the volume of the baggage, which is given by:

V(l, w, h) = lwh (cubic inches).

The constraint is l + w + h ≤ 62 (inches), but it is intuitively clear that in order to maximize the volume, the sum l + w + h has better be at maximum possible. Define g(l, w, h) = l + w + h, then the constraint can be regarded as the level set g = 62. Set up the Lagrange’s Multiplier system:

∇V = λ∇g g(l, w, h) = 62

which is equivalent to

wh = λ lh = λ lw = λ l + w + h = 62

Although it is not too difficult to solve them by hand, let’s type the following command on Mathematica to solve them: Solve[{w h == L, l h == L, l w == L, l + w + h == 62}, {l, w, h, L}]

These are all critical points:

 62 62 62  (l, w, h) = , , , (0, 0, 62), (0, 62, , 0), (62, 0, 0). 3 3 3

Only the first one is physically relevant. Therefore, the rectangular baggage with the largest volume under this restriction is the square cube!

 Example 2.15 Three cities A, B and C are located at (5, 2), (−4, 4) and (−1, −3) respectively on the (x, y)-plane. There is a railtrack whose equation is y = x3 + 1, and a station is going to be built on the track so that the sum of squares of the distances from each city to the station is minimized. Find the coordinates of the station.

 Solution Quantity to be minimized is

f (x, y, z) = (x − 5)2 + (y − 2)2 + (x + 4)2 + (y − 4)2 + (x + 1)2 + (y + 3)2 . | {z } | {z } | {z } distance2 from station to city A distance2 from station to city B distance2 from station to city C

The constraint is that the station has to be on the track, i.e. y = x3 + 1. Define g(x, y) = y − x3, then the constraint can be written as g(x, y) = 1. Set up the Lagrange’s Multiplier system 2.8 Optimizations 47

∇ f = λ∇g and g(x, y) = 1:

2(x − 5) + 2(x + 4) + 2(x + 1) = −3λx2 2(y − 2) + 2(y − 4) + 2(y + 3) = λ y − x3 = 1

Solving the system, we get (x, y) = (0, 1). Therefore, the station should be located at (0, 1) in order to minimize the sum of squares of the distances.

 Example 2.16 — Least Square Approximation. Given a set of data points:

(x1, y1),..., (xN, yN)

on the xy-plane. Find the straight-line y = mx + c such that the sum of squares of distances between each (xi, yi) and (xi, mxi + c) is minimized.

 Solution The quantity to be minimized is:

N 2 f (m, c) = ∑(yi − mxi − c) . i=1

Note that (xi, yi)’s are given so they should be regarded as constants. The variables are m and c. Note that there is no constraint for m and c, so we can simply solve ∇ f (m, c) = 0 for critical points.

N N N N ! ∂ f 2 = −2 ∑(yi − mxi − c)xi = −2 ∑ xiyi − m ∑ xi − c ∑ xi ∂m i=1 i=1 i=1 i=1 ! ∂ f N N N = −2 ∑(yi − mxi − c) = −2 ∑ yi − m ∑ xi − cN ∂c i=1 i=1 i=1

∂ f ∂ f Set ∂m = ∂c = 0, regarding all xi’s and yi’s to be constants, then: Am + Bc = E Bm + Nc = F

N 2 n N N where A = ∑i=1 xi , B = ∑i=1 xi, E = ∑i=1 xiyi and F = ∑i=1 yi. By solving the system carefully, one should get:

BF − EN (∑ x )(∑ y ) − N (∑ x y ) m = = i i i i B2 − AN 2 2 (∑ xi) − N ∑ xi  BE − AF (∑ x )(∑ x y ) − ∑ x2 (∑ y ) c = = i i i i i B2 − AN 2 2 (∑ xi) − N ∑ xi It is quite intuitive that this pair of m and c should minimize f since f ≥ 0 and so a minimum must exist. 48 Partial Differentiations

2 2  Example 2.17 Let f (x, y) = x − 4x + y + 9 (which was considered in Example 2.12 in the previous section). Find the absolute maximum and absolute minimum of f restricted to the domain 4x2 + 9y2 ≤ 36.

 Solution The Lagrange’s Multiplier method finds us the boundary critical points on 4x2 + 9y2 = 36. For the interior 4x2 + 9y2 < 36, the critical points are simply solutions to ∇ f = 0. The general procedure of an optimization problem with a solid domain is that: 1. Find all interior critical points by solving ∇ f = 0; 2. Find all boundary critical points using Lagrange’s Multiplier; 3. Evaluate f at each critical points found, and look for the point that gives greatest/lowest value of f . Interior: Set ∇ f = 0, we get:

2x − 4 = 0 2y = 0

Therefore, the only interior critical point is (2, 0), which can be checked easily that it is in the given region. Boundary: We have already done in Example 2.12 that the boundary critical points are (3, 0) and (−3, 0). Finally, evaluate f at each critical point found:

f (2, 0) = 5 f (3, 0) = 6 f (−3, 0) = 30

Therefore, the absolute minimum is 5 (attained at (2, 0)), and the absolute maximum is 30 (attained at (−3, 0)). 3 — Multiple Integrations

3.1 Double Integrals in Rectangular Coordinates In single-variable calculus, integration is used to find the area of the graph of a function f (x). In this chapter, we will generalize the concept of integrations to multivariable functions. There are many applications of multiple integrals in sciences, including deriving moments of inertia, probability, and in later part of the course: finding and surface flux. Computations of multivariable integrals are not much different from those in single-variable integrals, but setting up a multivariable integral involves a lot more geometric intuitions. Let’s first look at some computations first before we explain the geometry of these inte- grals.

 Example 3.1 Compute the following double integral:

y=2 x=1 (4 − x − y2x)dxdy. ˆy=1 ˆx=0

 Solution A double integral consists of an inner integral and an outer integral:

outer z }| { y=2 x=1 (4 − x − y2x)dx dy . ˆy=1 ˆx=0 | {z } inner

When computing the inner integral (which is respect to x in this example), we regard all other variable(s) (i.e. y) to be constant(s):

x=1 y=2 x=1 y=2  x2 y2x2  (4 − x − y2x)dxdy = 4x − − dy ˆy=1 ˆx=0 ˆy=1 2 2 x=0 y=2  1 y2   = 4 − − − 0 dy ˆy=1 2 2 y=2 y=2  7 y2   7y y3  7 = − dy = − = . ˆy=1 2 2 2 6 y=1 3

It is worthwhile to note that if we switch the inner and outer integrals, the final answer is the 50 Multiple Integrations same!

y=2 x=1 y=2 y=2  y3x  (4 − x − y2x)dydx = 4y − xy − dx ˆx=0 ˆy=1 ˆy=1 3 y=1 x=1  8x   x  = 8 − 2x − − 4 − x − dx ˆx=0 3 3 14.1 Double Integrals over Rectangular Regions 987 x=1  10x  7 = 4 − dx = . The functions that we encounter in this book are integrable. Advanced methods are ˆ 3 3 x=0 needed to prove that continuous functions and many functions with finite discontinuities It is not a coincident! Let’s explain why it is true by learning the geometricare also integrable. meaning of double integrals. Consider the integral: Iterated Integrals x=b y=d Evaluating double integrals using limits of Riemann sums is tedious and rarely done. For- f (x, y) dydx. tunately, there is a practical method that reduces a double integral to two single (one- ˆx=a ˆy=c variable) integrals. An example illustrates the technique. ➤ The inner integral: Recall the General Slicing Method. If a Suppose we wish to compute the volume of the solid region bounded by the plane solid is sliced parallel to the y-axis and = 1 2 = - - = 51 2 … … y=d z f x, y 6 2x y over the rectangular region R x, y : 0 x 1, perpendicularA(x) := to the xy-plane,f (x, y and) dy the 0 … y … 26 (Figure 14.5). By definition, the volume is given by the double integral cross-sectionalˆ areay=c of the slice at the point x is A1x2, then the volume of = 1 2 = 1 - - 2 V f x, y dA 6 2x y dA. is an integral with respect to y thekeeping solid regionx fixed.is This quantity represents the area underO O R R the curve obtained by moving along the y-directionb from y = c to y = d on the surface z = f (x, y), while keeping x unchanged.V See= FigureA1x2 dx 3.1.. According to the General Slicing Method (Section 6.3), we can compute this volume by La taking slices through the solid parallel to the y-axis and perpendicular to the xy-plane (Fig- ure 14.5). The slice at the point x has a cross-sectional area denoted A1x2. In general, as x varies, the area A1x2 also changes, so we integrate these cross-sectional from x = 0 to x = 1 to obtain the volume 1 V = A1x2 dx. L0 y The important observation is that for a fixed value of x, A1x2 is the area of the plane region under the curve z = 6 - 2x - y. This area is computed by integrating f with re- spect to y from y = 0 to y = 2, holding x fixed; that is, 2 A1x2 = 16 - 2x - y2 dy, L0 where 0 … x … 1, and x is treated as a constant in the integration. Substituting for A1x2, we have z Figure 3.1: geometric meaning of a double integral 1 1 2 V = A1x2 dx = c 16 - 2x - y2 dy d dx. L0 L0 L0

The outer integral integrates the inner integral A(x) from x = a to x = b, i.e. i y A1x2 x=b y=d x=b f (x, y) dydx = A(x) dx. The expression that appears on the right side of this equation is called an iterated ˆx=a ˆy=c ˆx=a integral (meaning repeated integral). We first evaluate the inner integral with respect to y holding x fixed; the result is a function of x. Then the outer integral is evaluated with re- Since A(x) dx can be thought as the volume of a solid slice with widthspect todx x;and the result cross-section is a real number, which is the volume of the solid in Figure 14.5. Both area A(x), by integrating A(x) dx it means adding up the volumethese of these integrals thin are slices ordinary and soone-variable integrals. the double integral 1 x=b EXAMPLE 1 Evaluating an iterated integral Evaluate V = 1 A1x2 dx, where 2 0 A(x) dx A1x2 = 16 - 2x - y2 dy. ˆx=a 10 is the volume under the graph z = f (x, y) over the base rectangleSOLUTION bounded by Usingx = thea, Fundamentalx = b, Theorem of Calculus, holding x constant, we have y = c and y = d. It is important to understand the geometric meanings of the inner and outer2 1 2 = 1 - - 2 integrals in order to set-up a double integral correctly. A x 6 2x y dy L0 As a double integral represents the volume of a solid,y one should not expect there is any y y 2 2 difference if we slice the solid in a different way. For instance, to find the volume under= a6y the- 2xy - b` Fundamental Theorem of Calculus A slice at a fixed value of x 2 0 has area A(x), where 0 x 1. = 112 - 4x - 22 - 0 Simplify; limits are in y. FIGURE 14.5 = 10 - 4x. Simplify. 14.1 Double Integrals over Rectangular Regions 987

The functions that we encounter in this book are integrable. Advanced methods are needed to prove that continuous functions and many functions with finite discontinuities are also integrable.

Iterated Integrals Evaluating double integrals using limits of Riemann sums is tedious and rarely done. For- tunately, there is a practical method that reduces a double integral to two single (one- variable) integrals. An example illustrates the technique. ➤ Recall the General Slicing Method. If a Suppose we wish to compute the volume of the solid region bounded by the plane solid is sliced parallel to the y-axis and = 1 2 = - - = 51 2 … … z f x, y 6 2x y over the rectangular region R x, y : 0 x 1, perpendicular to the xy-plane, and the 0 … y … 26 (Figure 14.5). By definition, the volume is given by the double integral cross-sectional area of the slice at the point x is A1x2, then the volume of = 1 2 = 1 - - 2 V f x, y dA 6 2x y dA. the solid region is O O R R b 3.1 Double Integrals in RectangularV = A1x2 dx. CoordinatesAccording to the General Slicing Method (Section 51 6.3), we can compute this volume by La taking988 slices throughChapter the 14 solid • parallelMultiple to Integrationthe y-axis and perpendicular to the xy-plane (Fig- graph z = 6 − 2x − y over the rectangular regionure 0 ≤ 14.5).x ≤ 1The and slice 0 ≤at ythe≤ point2, one x has can a cross-sectional set up the area denoted A1x2. In general, double integral in either way: as x varies, the area A1x2 also changes, so weSubstituting integrate A these1x2 = cross-sectional10 - 4x into the areas volume from integral, we have x = 0 to x = 1 to obtain the volume 1 = = x 1 y 2 1 V = A1x2 dx (6 − 2x − y) dy dx see Figure 3.2a V = A1x2 dx. L0 ˆx=0 ˆy=0 L 0 1 | {z } y A(x) The important observation is that for a fixed value of x, A1x2 is= the area110 of- the4x 2plane dx Substitute for A1x2. = - - L0 y=2 x=1 region under the curve z 6 2x y. This area is computed by integrating f with re- = = (6 − 2x − y) dx dy spect to y from ysee Figure0 to y 3.2b2, holding x fixed; that is, 1 2 ˆy=0 ˆx=0 2 = 110x - 2x 2` Fundamental Theorem | {z } A1x2 = 16 - 2x - y2 dy, 0 A(y) L0 = 8. Simplify.

where 0 … x … 1, and x is treated as a constant in the integration. Substituting for A1x2, Related Exercises 5–25 ➤ we have z z 1 1 EXAMPLE2 2 Same double integral, different order Example 1 used slices through V = A1x2 dx = cthe solid16 -parallel2x - toy2 the dy dy -axis.dx. Compute the volume of the same solid using slices y L0 L0 throughL0 the solid parallel to the x-axis and perpendicular to the xy-plane, for 0 … y … 2

(Figure 14.6).i y A1x2 SOLUTION In this case, A1y2 is the area of a slice through the solid for a fixed value of The expression that appears on the righty in side the intervalof this equation0 … y … is2. called This area an iteratedis computed by integrating z = 6 - 2x - y from integral (meaning repeated integral). We firstx =evaluate0 to x =the1, inner holding integral y fixed; with that respect is, to y holding x fixed; the result is a function of x. Then the outer integral is evaluated with1 re- spect to x; the result is a real number, which is the volume of the solid in FigureA1y2 = 14.5. 1Both6 - 2x - y2 dx, these integrals are ordinary one-variable integrals. L0 where 0 … y … 2. = 1 1 2 EXAMPLE 1 Evaluating an iterated integralUsing Evaluate the General V Slicing10 A x Method dx, where again, the volume is 1 2 = 21 - - 2 A x 10 6 2x y dy. 2 V = A1y2 dy General Slicing Method SOLUTION Using the Fundamental Theorem of Calculus, holdingL0 x constant, we have 2 2 1 A1x2 = 16 - 2x - y2 dy = c 16 - 2x - y2 dx d dy Substitute for A1y2. x L0 y L0 L0 2 2 y y y i y = 2a6y - 2xy - b` Fundamental Theorem of CalculusA1y2 A slice at a fixed value of x A slice at a fixed value of y has 2 0 2 1 has area A(x), where 0 x 1. area A(y), where 0 y 2. Fundamental Theorem of Calculus; = 112 - 4x - 22 - 0 Simplify; limits= arec1 in6 xy. - x 2 - yx2` d dy y is constant. FIGURE 14.5 FIGURE 14.6 L0 0 (a) slices with fixed x (b) slices with= 10 fixed- 4x.y Simplify. 2 Figure 3.2: volume under the same graph = 15 - y2 dy Simplify; limits are in x. L0 y 2 2 Readers should verify that the above integrals indeed give the same value (the answer = a5y - b` Evaluate outer integral. is 8). In general, the following Fubini’s Theorem asserts that switching dx and dy (and the 2 0 corresponding integral signs) give the same double integral. Although the statement of the = 8. Simplify.

theorem is geometrically intuitive, the proof is not easy and is beyond the scope of this course. Related Exercises 5–25 ➤

Theorem 3.1 — Fubini’s Theorem for Rectangular Regions. Let f (x, y) be a continuous functionSeveral important comments are in order. First, the two iterated integrals give the over a rectangular region a ≤ x ≤ b and c ≤ y ≤ d, then: same value for the double integral. Second, the notation of the iterated integral must be d b 1 2 d3 b 1 2 4 used carefully. When we write 1c 1a f x, y dx dy, it means 1c 1a f x, y dx dy. The y=d x=b x=b y=d QUICK CHECK 2 Consider the integral inner integral with respect to x is evaluated first, holding y fixed, and the variable runs ( ) = ( ) = = f x, y dxdy 4 2 1 f 2x, y dydx. from x a to x b. The result of that integration is a constant or a function of y, which ˆ ˆ ˆ 1ˆ1 f x, y dx dy. Give the limits of y=c x=a x=a 3 y=1 c is then integrated in the outer integral, with the variable running from y = c to y = d. integration and the variable of The order of integration is signified by the order of dx and dy. integration for the first (inner) integral Since the order of integration (i.e. dxdy or dydx) determines the order of the integral signs, b d 1 2 b3 d 1 2 4

and the second (outer) integral. Sketch Similarly, 1a 1c f x, y dy dx means 1a 1c f x, y dy dx. The inner integral with the above two double integrals can simply be writtenthe as: region of integration. ➤ respect to y is evaluated first, holding x fixed. The result is then integrated with respect to x.

d b b d f (x, y) dxdy = f (x, y) dydx. ˆc ˆa ˆa ˆc

Even simpler, one may denote dA := dxdy or dydx and the rectangular region by R. Then, we can write the integral as: f (x, y) dA. ¨R 52 Multiple Integrations

When setting up a double integral to find the volume of the solid under a graph z = f (x, y), it is worthwhile to observe that the lower and upper limits of the integral are not affected by the function f (x, y). Therefore, in order to interpret a double integral in a geometric way, one may simply draw the base region (or in other words, the top-down view) instead of drawing the solid in the three-dimensional space.

y x = 1 y x = 1

leaves at y = 2 y = 2 y = 2 enters at x = 0 leavs at x = 1

x x enters at y = 0

(a) top-down view of Figure 3.2a (b) top-down view of Figure 3.2b

Figure 3.3: the red arrows represent the cross-section slices in Figures 3.2a and 3.2b. 3.2 Fubini’s Theorem for General Regions 53 3.2 Fubini’s Theorem for General Regions In this section, we demonstrate some examples of double integrals whose base regions may not be rectangles but some general regions. Setting up these double integrals often require a certain degree of geometric intuition. Therefore it is important to see more examples to acquire the necessary skills.

 Example 3.2 Find the volume of the solid under the plane z = 3 − x − y over the triangle region R bounded by the x-axis, x = 1 and y = x.

 Solution First we choose an order of integration, say dydx. The inner integral should calculate the area of slices with y varies and x fixed. Since the height 3 − x − y of the solid does not affect how we set-up the upper/lower limits, we consider the top-down view of the solid (see Figure 3.4). The red strip in Figure 3.4b represents a sample slice with fixed x. The strip enters at y = 0 and leaves at y = x. Hence, the area of this slice is:

y=x (3 − x − y) dy. ˆy=0 | {z } height

“Summing up” the area of these slices, we integrate by dx over the range of x: 0 ≤ x ≤ 1 as shown in Figure 3.4b, i.e. x=1 y=x (3 − x − y) dy dx. ˆx=0 ˆy=0 | {z } inner integral It will give the volume of the solid as required in this problem. The rest of the task is to compute the integral:

y=x x=1 y=x x=1  y2  (3 − x − y) dydx = 3y − xy − dx ˆx=0 ˆy=0 ˆx=0 2 y=0 x=1  x2  = 3x − x2 − dx ˆx=0 2 1  3x2  = 3x − dx ˆ0 2 1  3x2 x3  = − 2 2 0 = 1.

Alernatively, we can also integrate first by dx then by dy. Then, the inner integral is represented by the red strip in Figure 3.4c. It enters at x = y and leaves at x = 1. The double integral is therefore: y=1 x=1 (3 − x − y) dxdy. ˆy=0 ˆx=y Readers should compute as an exercise that the answer is again 1. 54 14.2 Double Integrals over GeneralMultiple Regions Integrations771

z

y x 1 (3, 0, 0) y x

z 5 f(x, y) 5 3 2 x 2 y y x

R x (1, 0, 2) 0 y 0 1

(b)

y x 1

y x (1, 1, 1)

y x y x 1 (1, 0, 0) (1, 1, 0) R R x 5 1 x x 0 1 y 5 x (a) (c)

FIGURE 14.12 (a) Prism with a triangular base in the xy-plane. The volume of this prism is defined Figure 3.4: the graph and the top-down views of the function in Example 3.2 as a double integral over R. To evaluate it as an iterated integral, we may integrate first with respect to y and then with respect to x, or the other way around (Example 1). (b) Integration limits of

x=1 y=x ƒsx, yd dy dx. The fact that we have the freedom toL choosex=0 Ly=0 our order of integration is guaranteed by the Fubini’s Theorem, whose proof is again beyond the scope of this course. If we integrate first with respect to y, we integrate along a vertical line through R and then integrate Theorem 3.2from — Fubini’sleft to right Theorem to include forall the General vertical Regions.lines in R. (c)Let IntegrationR be a limits region of on the xy-plane and f (x, y) is a continuous function on R, then y=1 x=1 ƒsx, yd dx dy. f (x, y)dxdyLy=0 L=x=y f (x, y)dydx ¨R ¨R If we integrate first with respect to x, we integrate along a horizontal line through R and then inte- where the lower/uppergrate from bottom limits to top ofto include each integral all the horizontal are set lines up in according R. to the shape of the region R.

Notation As thereAlthough is no Fubini’s difference Theorem between assures usdxdy thatand a doubledydx integralas far may as thebe calculated upper and as an lower limits are setiterated according integral toin theeither same order region, of integration, we may the simply value of write: one integral may be easier to find than the value of the other. The next example shows how this can happen. dA = dxdy or dydx EXAMPLE 2 Calculate Although choosing the dxdy-order will yield the same result as the dydx-order, it happens sin x often that one order is easier while the other onedA is, harder. Let’s look at the following 6 x example: R where R is the triangle in the xy-plane bounded by the x-axis, the line y = x, and the line  Example 3.3 Let R be the region in the first quadrant of the xy-plane bounded by the unit = circle x2 + yx2 =1.1 and the straight-line x + y = 1. Evaluate the integral

p 1 − x2dA. ¨R 772 Chapter 14: Multiple Integrals

y Solution The region of integration is shown in Figure 14.13. If we integrate first with re- x 1 spect to y and then with respect to x, we find 3.2 Fubini’s Theorem for General Regionsy x 55 1 1 x 1 y=x 1 a sin x b = a sin x d b = x dy dx y x dx sin xdx L√0 L0 L0 y=0 L0 2  Solution First choose the order of integration. However, it seems like integrating 1 − x =-cos s1d + 1 L 0.46. by dx involves trig substitutions that we wantR to avoid if possible. Let’s try integrating first x by dy then by dx (to see if there is any01 luck). If we reverse the order of integration and attempt to calculate Set-up the double integral according the top-down view of the solid (Figure 3.5): 1 1 sin x FIGURE 14.13√ The region of integration x dx dy, L0 Ly x=in1 Exampley= 1 −2.x2 p 2 1 − x dydx. we run into a problem because sssin xd>xd dx cannot be expressed in terms of elemen- ˆx= ˆy= −x 1 0 1 tary functions (there is no simple ). There is no general rule for predicting which order of integration will be the good one As x is regarded as a constant wheny dealing with the inner integral, we can easily see that: in circumstances like these. If the order you first choose doesn’t work, try the other. Some- √ √ 2 x2 y2 1 x=1 y= 1−x p 1 x=1 h p iy= 1−x2 times neither order will work, and then we need to use numerical approximations. 1 − x2 dydx =R y 1 − x2 dx ˆx=0 ˆy=1−x ˆx=0 y=1−x Finding Limits of Integration x=1 2 p We now give a procedure for finding limits of integration that applies for many regions in x =y 1 (1 − x ) − (1 − x) 1 − x2 dx ˆx=0 the plane. Regions that are more complicated, and for which this procedure fails, can often x 011 p pbe split up into pieces on which the procedure works. = (1 − x2 − 1 − x2 + x 1 − x2) dx ˆ s d (a)0 Using Vertical Cross-sections When faced with evaluating 4R ƒ x, y dA, integrating first with respect to y and then with respect to x, do the following three steps: The only two difficult parts are y Leaves at 1. Sketch. Sketch the region of integration and label the bounding curves (Figure 14.14a). 1 y 11 x2 p 1 2 p 2 1 − x dxR and x 1 − x dx. 2. Find the y-limits of integration. Imagine a vertical line L cutting through R in the di- ˆ0 ˆ0Enters at rection of increasing y. Mark the y-values where L enters and leaves. These are the y 1 x y-limits of integration and are usually functions of x (instead of constants) (Figure The former can be evaluated by substitution x = sin θ, while the latter can be done by 14.14b). substituting u = 1 − x2. Readers shouldL complete the rest of computations as an exercise. − π x 3. Find the x-limits of integration. Choose x-limits that include all the vertical lines The final answer should be 1 4 . 01x We need a trig substitution anyway, but it is easier than doing a substitutionthrough forR the. The inner integral shown here (see Figure 14.14c) is integral. x=1 y= 21-x2 (b) ƒsx, yd dA = ƒsx, yd dy dx. 6 Lx=0 Ly=1-x R y Leaves at Using Horizontal Cross-sections To evaluate the same double integral as an iterated in- 2 y 1 x tegral with the order of integration reversed, use horizontal lines instead of vertical lines in 1 R Enters at Steps 2 and 3 (see Figure 14.15). The integral is

y 1 x 1 21-y 2 ƒsx, yd dA = ƒsx, yd dx dy. L 6 L0 L1-y R x 01x Largest y y x x Smallest Largest is y 1 Enters at is x 0 is x 1 x 1 y 1 R (c) Figure 3.5: top-down view of the region in Example 3.3 y FIGURE 14.14 Finding the limits of Smallest y Leaves at 2 In the previous example, we seeintegration that although when integrating the Fubini’s first with Theorem tells that in theory we is y 0 x 1 y x can choose our favorite the order ofrespect integration, to y and inthen practice with respectwe sometimesto x. have to make a smart 01 choice. In the next example, let’s demonstrate an example that one order gives an integral which is ridiciously hard to compute, while another is extremely easy. FIGURE 14.15 Finding the limits of integration when integrating first with  Example 3.4 Evaluate the following double integral: respect to x and then with respect to y. 1 1 sin x dxdy. ˆ0 ˆy x 56 Multiple Integrations

sin x  Solution The integrand x does not have a simple antiderivative when integrating by dx! Let’s switch the order of integration first. The region corresponds to the double integral is formed by strips entering at x = y and leaving at x = 1. The range of y is from 0 to 1. A sketch of the diagram can be found in Figure 3.6. Switching the order of integration, the strip for each x enters at y = 0 and leaves at y = x. Therefore, Fubini’s Theorem says:

1 1 sin x 1 x sin x dxdy = dydx. ˆ0 ˆy x ˆ0 ˆ0 x

The RHS is much easier to compute:

= 1 x sin x 1  sin x y x dydx = y dx ˆ0 ˆ0 x ˆ0 x y=0 1  sin x  = x − 0 dx ˆ0 x 1 = sin x dx = − cos 1. ˆ 772 Chapter0 14: Multiple Integrals

y Solution The region of integration is shown in Figure 14.13. If we integrate first with re- x 1 spect to y and then with respect to x, we find y x 1 1 x 1 y=x 1 a sin x b = a sin x d b = x dy dx y x dx sin xdx L0 L0 L0 y=0 L0 =- s d + L R cos 1 1 0.46. x 01 If we reverse the order of integration and attempt to calculate 1 1 sin x FIGURE 14.13 The region of integration x dx dy, Figure 3.6: the region of integration in Example 3.4 L0 Ly in Example 2. ss d> d If the region of integration is the shaded triangle below, which order of integrationwe run into is better? a problem because 1 sin x x dx cannot be expressed in terms of elemen- tary functions (there is no simple antiderivative). There is no general rule for predicting which order of integration will be the good one f (x, y) dxdyy or f (x, y) dydx, ¨R ¨R in circumstances like these. If the order you first choose doesn’t work, try the other. Some- 1 x2 y2 1 times neither order will work, and then we need to use numerical approximations. where f (x, y) is not very complicated, say f (x, y) = x2y. R Finding Limits of Integration y y = x We now give a procedure for finding limits of integration that applies for many regions in x y 1 the plane. Regions that are more complicated, and for which this procedure fails, can often x 01 be split up into pieces on which the procedure works. s d (a) Using Vertical Cross-sections When faced with evaluating 4R ƒ x, y dA, integrating first with respect to y and then with respect to x, do the following three steps: y y = 1 Leaves at x 1. Sketch. Sketch the region of integration and label the bounding curves (Figure 14.14a). 2 1 y 1 x R y = 8 − x 2. Find the y-limits of integration. Imagine a vertical line L cutting through R in the di- Enters at rection of increasing y. Mark the y-values where L enters and leaves. These are the y 1 x y-limits of integration and are usually functions of x (instead of constants) (Figure L 14.14b). x 3. Find the x-limits of integration. Choose x-limits that include all the vertical lines 01x through R. The integral shown here (see Figure 14.14c) is

x=1 y= 21-x2 (b) ƒsx, yd dA = ƒsx, yd dy dx. 6 Lx=0 Ly=1-x R y Leaves at Using Horizontal Cross-sections To evaluate the same double integral as an iterated in- 2 y 1 x tegral with the order of integration reversed, use horizontal lines instead of vertical lines in 1 R Enters at Steps 2 and 3 (see Figure 14.15). The integral is

y 1 x 1 21-y 2 ƒsx, yd dA = ƒsx, yd dx dy. L 6 L0 L1-y R x 01x Largest y y x x Smallest Largest is y 1 Enters at is x 0 is x 1 x 1 y 1 R (c) y FIGURE 14.14 Finding the limits of Smallest y Leaves at 2 integration when integrating first with is y 0 x 1 y x respect to y and then with respect to x. 01

FIGURE 14.15 Finding the limits of integration when integrating first with respect to x and then with respect to y. 3.3 Double Integrals in Polar Coordinates 57 3.3 Double Integrals in Polar Coordinates When the region of integration is circular in shape, or the integrand is rotationally symmetric, it is often more convenient to use polar coordinates to set-up the integral instead of the using the rectangular coordinates as in the previous section. We will see in some examples in this section that some tedious trig substitution can be avoid if polar coordinates are used. Recall that the polar coordinates (r, θ) and the rectangular coordinates (x, y) are related by the following rules:

x = r cos θ y = r sin θ

Here r is the distance from the point to the origin, and θ is the angle made with the positive x-axis. In rectangular coordinates, the region defined by inequalities like a ≤ x ≤ b and c ≤ y ≤ d, where a, b, c and d are constants, describe a rectangle. We have seen that it is very easy to set-up a double integral when the region is a rectangle. In polar coordinates, regions defined by inequalities like a ≤ r ≤ b and α ≤ θ ≤ β, where a, b, α and β are constants, describe a fan shape or a circular sector (see Figure 3.7). It is wise to use polar coordinates instead of rectangular coordinates when the region of integration is 1006 Chapter 14 • Multiplegiven by Integration one of these circular shapes.

y y y Examples of polar rectangles b

R R R b b ␤ ␣ ␤ a ␣ 0 b x x x

0 r b 0 r b a r b 0 ␪ q ␣ ␪ ␤ ␣ ␪ ␤

FIGURE 14.28 Figure 3.7: examples of regions good for polar coordinates R {(r, ␪): a r b, ␣ ␪ ␤} Our approach is to divide 3a, b4 into M subintervals of equal length r = 1b - a2>M. ␪ ␤ To set-upWe similarly a double divide integral 3a, b4 of into a regionm subintervalsa ≤ r ≤of bequaland lengthα ≤ θ≤u =β 1usingb - a2> polarm. Now coordinates, (r *, ␪ *) k k A the upper/lowerlook at the limits arcs of are the simply: circles centered at the origin with radii k r = a, r = a + r, r = a + 2r, c, r = b ␪ ␣ θ=β r=b r=b θ=β or and the rays ˆ ˆ ˆ ˆ r b θ=α r=a r=a θ=α u = a, u = a + u, u = a + 2u, c, u = b ␪ rdepending on the order of integration drdθ or dθdr. However, one should be very cautious that emanating from the origin (Figure 14.29). These arcs and rays divide the region R into r a while dA = dxdy in rectangular coordinates, it is instead: n = Mm polar rectangles that we number in a convenient way from k = 1 to k = n. The ␪ 0 area of the kth rectangle is denoted A , and we let 1r *, u*2 be an arbitrary point in that 0 dA = rdrdk θ or rdθkdr k FIGURE 14.29 rectangle. 1 * u*2 Consider the “box” whose base is the kth polar rectangle and whose height is f r k, k ; in polar coordinates.1 We* u will*2 explain= whyc it is so after learning a few examples. its volume is f r , A , for k 1, . , n. Therefore, the volume of the solid region be- A r * r ␪ k k k k k = 1 u2 neath the surface z f r, with a base R is approximately n r ␪ ( k*, k*) Ϸ 1 * u*2 V a f r k, k Ak. = r k 1 r k* 2 r This approximation to the volume is another Riemann sum. We let be the maximum value of r and u. If f is continuous on R, then as S 0, the sum approaches a double ␪ integral; that is, n r 1 u2 = 1 * u*2 r f r, dA lim a f r k, k Ak. k* S 2 O 0 k=1 R

FIGURE 14.30 The next step is to write the double integral as an iterated integral. In order to do so, we must express Ak in terms of r and u. * * ➤ Recall that the area of a sector of a circle Figure 14.30 shows the kth polar rectangle, with an area Ak. The point 1r k, uk2 is u * of radius r subtended by an angle is chosen so that the outer arc of the polar rectangle has radius r k + r>2 and the inner arc 1 2u * 2 r . has radius r k - r>2. The area of the polar rectangle is

Ak = 1area of outer sector2 - 1area of inner sector2 1 r 2 1 r 2 r 2 = ar * + b u - ar * - b u Area of sector = u 2 k 2 2 k 2 2 = * u ␪ r k r . Expand and simplify.

Substituting this expression for Ak into the Riemann sum, we have r n n 1 * u*2 = 1 * u*2 * u a f r k, k Ak a f r k, k r k r . k=1 k=1 58 Multiple Integrations

 Example 3.5 Evaluate the integral:

(x2 + y2)dA ¨R √ where R is the semicircular region bounded by the x-axis and the curve y = 1 − x2. See Figure 3.8.

 Solution The problem is extremely difficult to do in rectangular coordinates. If one attempts to set it up using xy-coordinates, one will get the following double integral: √ 1 1−x2 (x2 + y2)dydx. ˆ−1 ˆ0 After computing the inner integral, one should get: ! 1 p (1 − x2)3/2 x2 1 − x2 + dx. ˆ−1 3

However, things go much better if we switch to polar coordinates, since the region is in the form a ≤ r ≤ b and α ≤ θ ≤ β. From Figure 3.8, the region is defined by:

0 ≤ r ≤ 1, 0 ≤ θ ≤ π.

Keeping in mind that dA = rdrdθ, the required integral is:

θ=π r=1   (x2 + y2)dA = (r cos θ)2 + (r sin θ)2 rdrdθ ¨ ˆ = ˆ = R θ 0 r 0 | {z } x2+y2 θ=π r=1 = r2(cos2 θ + sin2 θ)rdrdθ ˆθ=0 ˆr=0 θ=π r=1 = r3drdθ ˆθ=0 ˆr=0 r=1 14.4 Double Integrals in Polar Form 783 θ=π  r4  = dθ ˆθ=0 4 r=0 EXAMPLE 3 Evaluate = θ π 2 + 2 1 π ex y dy dx, = dθ = . 6 ˆθ=0 4 4 R where R is the semicircular region bounded by the x-axis and the curve y = 21 - x2 (Figure 14.26).

y Solution In Cartesian coordinates, the integral in question is a nonelementary integral 2 2 and there is no direct way to integrate ex +y with respect to either x or y. Yet this integral y 1 x2 and others like it are important in mathematics—in , for example—and we need 1 r 1 to find a way to evaluate it. Polar coordinates save the day. Substituting x = r cos u, y = r sin u and replacing dy dx by r dr du enables us to evaluate the integral as

p 1 p 1 2 + 2 2 1 2 0 ex y dy dx = er r dr du = c er d du x 2 –1 01 6 L0 L0 L0 0 R p 1 p FIGURE 14.26 The semicircular region = se - 1d du = se - 1d. 2 2 Figure 3.8: regionin Example of3 is integrationthe region for Example 3.5 L0 … … … … 2 0 r 1, 0 u p. The r in the r dr du was just what we needed to integrate er . Without it, we would have Annular regions, to be discussed in the next example, are extremelybeen unable to clumsy find an antiderivative using rectangu- for the first (innermost) iterated integral. lar coordinates but relatively easy using polar coordinates. EXAMPLE 4 Evaluate the integral

1 21-x2 (x2 + y 2) dy dx. L0 L0

Solution Integration with respect to y gives

1 (1 - x2)3>2 ax2 21 - x2 + b dx, L0 3 an integral difficult to evaluate without tables. Things go better if we change the original integral to polar coordinates. The region of integration in Cartesian coordinates is given by the inequalites 0 … y … 21 - x2 and 0 … x … 1, which correspond to the interior of the unit quarter circle x2 + y 2 = 1 in the first quadrant. (See Figure 14.26, first quadrant.) Substituting the polar coordinates x = r cos u, y = r sin u, 0 … u … p>2 and 0 … r … 1, and replacing dx dy by r dr du in z the double integral, we get

z 5 9 2 x2 2 y2 1 21-x2 p>2 1 2 2 2 9 (x + y ) dy dx = (r ) r dr du L0 L0 L0 L0 p> = p> 2 r 4 r 1 2 1 p = c d du = du = . L0 4 r=0 L0 4 8

Why is the polar coordinate transformation so effective here? One reason is that x2 + y 2 –2 simplifies to r 2 . Another is that the limits of integration become constants.

EXAMPLE 5 Find the volume of the solid region bounded above by the paraboloid 2 y z = 9 - x2 - y 2 and below by the unit circle in the xy-plane. x2 1 y2 5 1 2 x Solution The region of integration R is the unit circle x2 + y 2 = 1, which is described FIGURE 14.27 The solid region in in polar coordinates by r = 1, 0 … u … 2p. The solid region is shown in Figure 14.27. Example 5. The volume is given by the double integral 1008 Chapter 14 • Multiple Integration

z In polar coordinates, the upper bounding surface of the solid is z = 8 - r 2, and the z 8 x2 y2 lower bounding surface is z = r 2. The volume of the solid is Intersection 2p 2 curve C V = a18 - r 22 - r 2 b r dr du (+)+* ()* L0 L0 upper lower 2p 2 C = 18r - 2r 32 dr du Simplify integrand. L0 L0 p 2 r 4 2 = a4r 2 - b` du Evaluate inner integral. y 4 x2 z x2 y2 L0 2 0 2p R = u 3.3 Double Integrals in Polar Coordinates 59 8 d Simplify. 2 2 L x y 0 = p

Projection of C y 4 x2 16 . Evaluate outer integral.  Example 3.6 Evaluate the following integral: ➤ on xy-plane Related Exercises 19–22

FIGURE 14.32 x dA ¨R EXAMPLE 3 Annular region Find the volume of the region beneath the surface z = xy + 10 and above the annular region R = 51r, u2: 2 … r … 4, 0 … u … 2p6. where R is an annular region shown in Figure 3.9. (An annulus is the region between two concentric circles.) SOLUTION The region of integration suggests working in polar coordinates  Solution The annular region R is defined by inequalities: (Figure 14.33). Substituting x = r cos u and y = r sin u, the integrand becomes 2 ≤ r ≤ 4, 0 ≤ θ ≤ 2π. xy + 10 = 1r cos u21r sin u2 + 10 Substitute for x and y. Therefore, = r 2 sin u cos u + 10 Simplify. = 1 2 u + u = u u 2π 4 2 r sin 2 10. sin 2 2 sin cos x dA = r cos θ · rdrdθ ¨R ˆ0 ˆ2 Substituting the integrand into the , we have 2π 4 2p 4 = r2 cos θ drdθ ˆ ˆ = 11 2 u + 2 u 0 2 V 2 r sin 2 10 r dr d Iterated integral for volume r=4 L L 2π  r3  0 2 = · cos θ dθ 2p 4 ˆ 0 3 r=2 = 11 3 u + 2 u 2 r sin 2 10r dr d Simplify. 2π 56 L L = cos θ dθ 0 2 ˆ0 3 2p 4 4 r 2  56 2π = a sin 2u + 5r b` du Evaluate the inner integral. = sin θ L0 8 2 3 0 2p = 0 = 130 sin 2u + 602 du Simplify. L0 2p y = 1151-cos 2u2 + 60u2` = 120p. Evaluate the outer integral.

4 0 r 2 ➤ r 4 Related Exercises 23–32 More General Polar Regions 4 4 x In Section 14.2 we generalized double integrals over rectangular regions to double inte- R grals over nonrectangular regions. In an analogous way, the method for integrating over a polar rectangle may be extended to more general regions. Consider a region bounded by 4 two rays u = a and u = b, where b - a … 2p, and two curves r = g1u2 and r = h1u2 R {(r, ␪): 2 r 4, 0 ␪ 2␲} (Figure 14.34): Figure 3.9: region of integration for Example 3.6 FIGURE 14.33 R = 51r, u2: 0 … g1u2 … r … h1u2, a … u … b6.

A Notoriously Difficult Single-Variable Integral Next we present an application of evaluating a notoriously difficult single-variable integral using a double integral in polar coordinates. It is famous (or infamous) that the integral:

2 e−x dx ˆ does not have an easy explicit expression. Nonetheless, this integral is extremely important in probability, heat flow and many physics and engineering subjects. While this integral is generally hard to compute, it can be magically done, with the help of a double integral, when the upper/lower limits are: ∞ 2 e−x dx. ˆ0 60 Multiple Integrations

Consider the double integral: ∞ ∞ 2 2 e−x −y dxdy. ˆ0 ˆ0 2 2 2 2 2 Observing that e−x −y = e−x e−y , we can regard e−y as a constant in the inner dx-integral: ∞ ∞ ∞ ∞ ∞ ∞ 2 2 2 2 2  2  e−x −y dxdy = e−x e−y dxdy = e−y e−x dx dy. ˆ0 ˆ0 ˆ0 ˆ0 ˆ0 ˆ0 ∞ 2 Now that e−x dx is a constant, and in particular, independent of y. Therefore, with respect ˆ0 to the outer dy-integral, one can factor it out and yield: ∞ ∞ ∞ ∞ 2  2   2   2  e−y e−x dx dy = e−x dx e−y dy . ˆ0 ˆ0 ˆ0 ˆ0 Now the double integral are split as a product of two single-variable integrals. Note that they two single-variable integrals are the same as x and y are merely dummy variables! Therefore, combining everything we have shown above, we have:

∞ ∞ ∞ ∞ ∞ 2 2 2  2   2   2  e−x −y dxdy = e−x dx e−y dy = e−x dx . ˆ0 ˆ0 ˆ0 ˆ0 ˆ0 ∞ 2 Consequently, in order two evaluate the single-variable integral e−x dx, one can first ˆ0 evaluate the double integral ∞ ∞ 2 2 e−x −y dxdy ˆ0 ˆ0 ∞ 2 and then the square root of the double integral will give the value of e−x dx. ˆ0 In contrast to the single-variable integral, the double integral is relatively easy to find using polar coordinates. The region of integration is the entire first quadrant which, in polar coordinates, can be described as: π 0 ≤ r ≤ ∞, 0 ≤ θ ≤ . 2 Therefore, ∞ ∞ π/2 ∞ 2 2 2 e−x −y dxdy = e−r r drdθ ˆ0 ˆ0 ˆ0 ˆ0 π/2  r=∞ 1 2 d 2 2 = − e−r dθ note that e−r = e−r · (−2r) ˆ0 2 r=0 dr π/2  1  = −0 + dθ ˆ0 2 π = . 4 Therefore, √ ∞ r 2 π π e−x dx = = . ˆ0 4 2 2 Furthermore, e−x is an even function, we also have: ∞ 2 √ e−x dx = π. ˆ−∞ However, this trick does not work well for any similar definite integral such as:

1 2 e−x dx. ˆ0 3.3 Double Integrals in Polar Coordinates 61

Although it is still true (from a similar argument as above) that we have

2 1 ! 1 1 2 2 2 e−x dx = e−x −y dxdy, ˆ0 ˆ0 ˆ0 the double integral is not “polar-friendly” since the region of integration is a square!

Explanation of the Polar Method We end this section by explaining why dA = rdrdθ but not simply dA = drdθ. In rectangular coordinates, the area of a region is calculated by chopping off the region into tiny rectangular pieces with length and width denoted by the changes ∆x and ∆y. The area of each rectangular piece is given by ∆A = ∆x · ∆y. When these rectangular pieces get smaller and smaller, ∆x and ∆y become dx and dy, and so the area of these rectangular pieces becomes dA = dxdy. However, things are a bit different in polar coordinates. Instead of chopping off a given region into tiny rectangles, the region is chopped into “fan-shaped” pieces (see Figure 3.10. Given a pair of ∆r and ∆θ, the area ∆A of the piece is not simply ∆r · ∆θ, but is getting proportionally larger when the piece is further from the origin! Therefore, one needs to multiply r to ∆r · ∆θ in order to accurately reflect the size of ∆A. Precisely, ∆A can be calculated as the difference between the area of an outer sector (with radius r + ∆r) and an inner section (with radius r):

1 2 1 2 ∆A =1006 (r + ∆Chapterr) ∆ θ14− • Multipler ∆ Integrationθ 2 2 | {z } | {z } y y y area of outer sector area of inner sector Examples of 1   polar rectangles = r2 + 2r∆r + (∆r)2 − r2 ∆θb 2 R R 1 2 R = r∆r · ∆θ + (∆r) ∆θ. b 2 b ␤ ␣ ␤ a ␣ When ∆r and ∆θ get smaller and smaller, the last term (∆r)2∆θ 0is negligibleb whenx compared x x to the other term r∆r∆θ. Therefore, as the region is chopped into infinitesimal0 r b pieces, we0 have: r b a r b 0 ␪ q ␣ ␪ ␤ ␣ ␪ ␤ dA = rdrdθ. FIGURE 14.28

R {(r, ␪): a r b, ␣ ␪ ␤} Our approach is to divide 3a, b4 into M subintervals of equal length r = 1b - a2>M. ␪ ␤ We similarly divide 3a, b4 into m subintervals of equal length u = 1b - a2>m. Now (r *, ␪ *) k k A look at the arcs of the circles centered at the origin with radii k r = a, r = a + r, r = a + 2r, c, r = b ␪ ␣ and the rays r b u = a, u = a + u, u = a + 2u, c, u = b ␪ r emanating from the origin (Figure 14.29). These arcs and rays divide the region R into r a n = Mm polar rectangles that we number in a convenient way from k = 1 to k = n. The ␪ 0 1 * u*2 0 area of the kth rectangle is denoted Ak, and we let r k, k be an arbitrary point in that FIGURE 14.29 rectangle. 1 * u*2 Consider the “box” whose base is the kth polar rectangle and whose height is f r k, k ; Figure 3.10: area of ∆A gets larger when r gets larger.1 * u*2 = c its volume is f r , A , for k 1, . , n. Therefore, the volume of the solid region be- A r * r ␪ k k k k k = 1 u2 neath the surface z f r, with a base R is approximately n r ␪ ( k*, k*) Ϸ 1 * u*2 V a f r k, k Ak. = r k 1 r k* 2 r This approximation to the volume is another Riemann sum. We let be the maximum value of r and u. If f is continuous on R, then as S 0, the sum approaches a double ␪ integral; that is, n r 1 u2 = 1 * u*2 r f r, dA lim a f r k, k Ak. k* S 2 O 0 k=1 R

FIGURE 14.30 The next step is to write the double integral as an iterated integral. In order to do so, we must express Ak in terms of r and u. * * ➤ Recall that the area of a sector of a circle Figure 14.30 shows the kth polar rectangle, with an area Ak. The point 1r k, uk2 is u * of radius r subtended by an angle is chosen so that the outer arc of the polar rectangle has radius r k + r>2 and the inner arc 1 2u * 2 r . has radius r k - r>2. The area of the polar rectangle is

Ak = 1area of outer sector2 - 1area of inner sector2 1 r 2 1 r 2 r 2 = ar * + b u - ar * - b u Area of sector = u 2 k 2 2 k 2 2 = * u ␪ r k r . Expand and simplify.

Substituting this expression for Ak into the Riemann sum, we have r n n 1 * u*2 = 1 * u*2 * u a f r k, k Ak a f r k, k r k r . k=1 k=1 62 Multiple Integrations 3.4 Triple Integrals in Rectangular Coordinates We now move one dimension up and talk about triple integrals, whose integrands are functions of three variables such as f (x, y, z). An example of such an integral is:

1 y y f (x, y, z)dxdzdy. ˆ0 ˆ0 ˆz Some triple integrals have important physical meanings. For instance, if f (x, y, z) is the density function, then the triple integral

1 y y f (x, y, z) dxdzdy ˆ0 ˆ0 ˆz represents the mass of the solid (whose shape is determined by the upper/lower limits of the integral). It is because dxdzdy represents (infinitesimal) volume, and density × volume = mass. In physics, some calculations such as the center of mass of an object and the moment of inertia about an axis also involve evaluations of triple integrals.

Pillar-Base Approach Before we see some practical applications of triple integrals, let’s learn how a triple integral can be set-up. One common approach is so-called the pillar-base approach or pillar-shadow approach. Let’s explain this through an example:

 Example 3.7 Let D be the tetrahedral solid bounded by the plane z = y − x, the plane y = 1, the xy-plane and the yz-plane (see Figure 3.11). Evaluate the triple integral:

x2dzdydx. ˚D

 Solution We demonstrate the pillar-base approach in the solution. The so-called pillar is the orange ray labeled M in Figure 3.11, and it determines how the inner-most integral is set-up:

z=? x2dz. ˆz=? Its lower and upper limits of z are determined by, respectively, where the pillar enters the solid and where the pillar leaves the solid. According to the diagram, it enters at z = 0, i.e. the xy-plane; and it leaves at z = y − x. Therefore, the inner-most integral should be:

z=y−x x2dz. ˆz=0

Note that again the integrand x2 does not affect how we set-up this integral. Next we call the other two variables x and y to be base variables or shadow variables. Cast some light on the solid in the direction parallel to the pillar (i.e. z-direction), a shadow appears on the base xy-plane as a triangle. To way to set-up the middle and outer-most integrals are just the same as what we did for double integrals. Since we picked the dydx-order, we draw a sample strip L on the base in the direction of y. This strip enters at y = x and leaves at y = 1, and therefore the middle and outer-most integrals should be set-up as:

x=1 y=1 z=y−x x2 dzdydx. ˆx=0 ˆy=x ˆz=0 | {z } | {z } base pillar 3.4 Triple Integrals in Rectangular Coordinates 63

Finally, the easiest step is to evaluate the integral. This is as straight-forward as in double integrals:

1 1 y−x 1 1 2 2 z=y−x x dzdydx = [x z]z=0 dydx ˆ0 ˆx ˆ0 ˆ0 ˆx 1 1 = x2(y − x)dydx ˆ0 ˆx y=1 1  x2y2  = − x3y dx ˆ0 2 y=x 1  x2 x4  = − x3 − + x4 dx ˆ0 2 2 1  x2 x4  = − x3 + dx ˆ0 2 2 1  x3 x4 x5  = − + 6 4 10 0 1 1 1 1 = − + = . 6 4 10 60 14.5 Triple Integrals in Rectangular Coordinates 791

z Finally we find the x-limits of integration. As the line L parallel to the y-axis in the previous step sweeps out the shadow, the value of x varies from x = 0 to x = 1 at the z 5 y 2 x point (1, 1, 0) (see Figure 14.32). The integral is (0, 1, 1) M 1 1 y-x Fsx, y, zd dz dy dx. L0 Lx L0

D For example, if Fsx, y, zd = 1, we would find the volume of the tetrahedron to be (0, 1, 0) 0 y - x L 1 1 y x (x, y) V = dz dy dx y 5 1 L0 Lx L0 y 5 x 1 1 R = s y - xd dy dx 1 (1, 1, 0) L0 Lx = 1 1 y 1 x = c y 2 - xy d dx L0 2 y=x FIGUREFigure 14.32 3.11: solidThe Dtetrahedronin Example in 3.7. 1 Example 3 showing how the limits of 1 1 = a - x + x 2 b dx The Fubini’s Theoremintegration also holds are for found triple for integrals, the order i.e. dz we dy can dx. evaluate the integral by L0 2 2 different order of integration, say dxdydz or dydxdz (there are 6 possible orders), we should get the same value provided that the upper and lower limits are adjusted to present the same solid. 1 1 1 1 = c x - x 2 + x 3 d 2 2 6  Example 3.8 Let D be the tetrahedral solid in Example 3.7. Evaluate the triple integral 0 1 2 = . x dydzdx 6 ˚D using the dydzdx-order. We get the same result by integrating with the order dy dz dx. From Example 2,

1 1-x 1 V = dy dz dx L0 L0 Lx+z 1 1-x = (1 - x - z) dz dx L0 L0 = - 1 1 z 1 x = c(1 - x)z - z2 d dx L0 2 z=0 1 1 = c(1 - x)2 - (1 - x)2 d dx L0 2 1 1 = (1 - x)2 dx 2 L0 1 1 1 =- (1 - x)3 d = . 6 0 6

Average Value of a Function in Space The average value of a function F over a region D in space is defined by the formula

1 Average value of F over D = FdV. (2) volume of D 9 D

For example, if Fsx, y, zd = 2x2 + y 2 + z2, then the average value of F over D is the average distance of points in D from the origin. If F(x, y, z) is the temperature at (x, y, z) on a solid that occupies a region D in space, then the average value of F over D is the average temperature of the solid. 64 Multiple Integrations

 Solution Now the inner-most integral is with respect to dy, meaning the pillar is pointing along the positive y-axis. It is the ray labeled as M in Figure 3.12. It enters the solid through the plane z = y − x, or equivalently y = x + z, and it leaves at y = 1. Therefore, the inner-most integral should be: y=1 x2 dy. ˆy=x+z The middle and the outer-most variables are z and x, so the base is the shadow of the solid on the xz-plane, which is790 the triangleChapter labeled 14: MultipleR inIntegrals Figure 3.12. Here the order of integration is dzdx, so we draw a sample strip (labeled L) along the z-axis direction. It enters the region through z = 0 and leaves atNext the we line find thex + y-limitsz = 1of, integration. or The line L through (x, y) parallel to the y-axis equivalently, z = 1 − x. Therefore, the whole triple integral shouldenters beR at set y as:=-2s4 - x2d>2 and leaves at y = 2s4 - x2d>2. Finally we find the x-limits of integration. As L sweeps across R, the value of x varies x=1 z=1−x y=1 from x =-2 at s -2, 0, 0d to x = 2 at (2, 0, 0). The volume of D is x2 dydzdx. ˆ ˆ ˆ x=0 z=0 y=x+z V = dz dy dx 9 It can be verified by straight-forward computations that the answer shouldD be the same as 2 2s4-x2d>2 8-x2 -y2 we got in Example 3.7: = dz dy dx L-2L-2s4-x2d>2 Lx2 +3y2 x=1 z=1−x y=1 1 1−x y=1 2 h 2 i 2 2s4-x2d>2 x dydzdx = x y dzdx = s - 2 - 2d ˆ ˆ ˆ ˆ ˆ y=x+z 8 2x 4y dy dx x=0 z=0 y=x+z 0 0 L-2L-2s4-x2d>2

1 1−x 2  2 2  2 y= 2s4-x d>2 = x − x (x + z) dzdx 2 4 3 ˆ ˆ = cs8 - 2x dy - y d dx 0 0 L-2 3 y=-2s4-x2d>2 z=1−x 1  2 2  > 2 3 x z 2 4 - x 2 8 4 - x 2 3 2 = (x − x )z − = dx a2s8 - 2x 2d - a b b dx ˆ0 2 z=0 L-2 B 2 3 2 1  2 2 2  2 3>2 2 3>2 2 x (1 − x) 4 - x 8 4 - x 422 > = (x2 − x3)( − x) − = c8 a dx b - a b d dx = s4 - x 2d3 2 dx 1 2 3 2 3 ˆ0 2L-2 L-2 1  x2 x4  1= 8p22. After integration with the substitution x = 2 sin u = − x3 + dx = (same!) ˆ0 2 2 60 In the next example, we project D onto the xz-plane instead of the xy-plane, to show how to use a different order of integration.

z EXAMPLE 2 Set up the limits of integration for evaluating the triple integral of a function F(x, y, z) over the tetrahedron D with vertices (0, 0, 0), (1, 1, 0), (0, 1, 0), and (0, 1, 1). Use the order of integration dy dz dx. 1 (0, 1, 1) Solution We sketch D along with its “shadow” R in the xz-plane (Figure 14.31). The up- 5 1 L y x z per (right-hand) bounding surface of D lies in the plane y = 1. The lower (left-hand) Line bounding surface lies in the plane y = x + z. The upper boundary of R is the line x 1 z 5 1 D 5 y 1 z = 1 - x. The lower boundary is the line z = 0. R (0, 1, 0) y First we find the y-limits of integration. The line through a typical point (x, z) in R = + = M parallel to the y-axis enters D at y x z and leaves at y 1. (x, z) Next we find the z-limits of integration. The line L through (x, z) parallel to the z-axis x Leaves at enters R at z = 0 and leaves at z = 1 - x. y 5 1 Enters at Finally we find the x-limits of integration. As L sweeps across R, the value of x varies y 5 x 1 z 1 from x = 0 to x = 1. The integral is (1, 1, 0) x 1 1-x 1 Fsx, y, zd dy dz dx. L0 L0 Lx+z Figure 3.12: the pillar-baseFIGURE diagram14.31 Finding for the the limits triple of integral in Example 3.8 integration for evaluating the triple integral of a function defined over the tetrahedron EXAMPLE 3 Integrate F(x, y, z) = 1 over the tetrahedron D in Example 2 in the order The Fubini’s Theorem tells usD (Examples that no 2 and matter 3). what order ofdz integrationdy dx, and then integrate we choose, in the order we dy dz dx. always get the same answer. Therefore, we can sometimes us the notation dV to denote the volume element dxdydz (or any other order). A generic triple integralSolution canFirst be we written find the as: z-limits of integration. A line M parallel to the z-axis through a typical point (x, y) in the xy-plane “shadow” enters the tetrahedron at z = 0 and exits through the upper plane where z = y - x (Figure 14.32). f (x, y, z) dV. Next we find the y-limits of integration. On the xy-plane, where z = 0, the sloped side ˚D of the tetrahedron crosses the plane along the line y = x. A line L through (x, y) parallel to the y-axis enters the shadow in the xy-plane at y = x and exits at y = 1 (Figure 14.32). 3.4 Triple Integrals in Rectangular Coordinates 65

Although Fubini’s Theorem allows us to switch the order of integral, owever, to ease our computation, we sometimes need to choose a smart choice of pillar and base variables. Let’s look at the next example:

2 2 2  Example 3.9 Let D be the solid bounded by the paraboloid y = x + z and y = 16 − 3x − z2 (see Figure 3.13). Find the volume of the solid, i.e. evaluate the integral:

1dV. ˚D

 Solution After taking a careful look at the diagram, one should see that it would be a bad choice if we chose either x or z to be the pillar variable. The solid can be decomposed into two parts as shown in the diagram. If z were chosen to be the pillar direction, then the pillar would enter the yellow part in a way different from it does in the blue part. The yellow p part would have z = ± 16 − 3x2 − y as the lower/upper limits, while the blue part have p z = ± y − x2. Even worse, the part near the intersection of the blue and the yellow surfaces is geometrically complicated – it is not easy to set up the inner integral for that part. However, life is much easier if we choose y as the pillar variable. Since then the y-pillar will enter through the blue surface and leave through the yellow surface. The shadow is an ellipse on the xz-plane. To set-up the inner integral, we note that the y-pillar enters at y = x2 + z2 and leaves at y = 16 − 3x2 − z2: y=16−3x2−z2 dy. ˆy=x2+z2 The shadow (or base) is an ellipse, whose equation can be obtained by setting y = x2 + z2 = 16 − 3x2 − z2, which gives 1020 Chapter 14 • Multiple Integration 2x2 + z2 = 8.

For the baseSOLUTION integral, Wewe identify can the either right choose boundarydzdx of Dor as dxdzthe surface. For y the= 16 former,- 3x 2 the- z sample2; the z-strip √ 2 2 √ enters theleft ellipse boundary at is− y =8 −x 2+x2zand. These leaves surfaces at are8 functions− 2x2. The of x and min/max z, so they values determine of x are −2 and 2 forthe the limits ellipse. of integration Therefore, for the the inner middle integral and in outerthe y-direction. integrals should: A key step in the calculation is finding the curve of intersection between the two √ surfaces and projectingx it= 2onto zthe= xz8−-plane2x2 toy =form16− 3thex2− boundaryz2 of the region R. Equat- ing the y-coordinates of the two surfaces, we have x 2 + z 2 = 16 - 3x 2 - z 2, which √ dydzdx. becomes the equationˆx of= −an2 ˆellipse:z=− 8−2x2 ˆy=x2+z2 4x 2 + 2z 2 = 16, or z = { 28 - 2x 2. Evaluation of this integral is straight-forward√ (but be careful). It is left as an exercise for readers. TheThe projection answer shouldof the solid be region 32π D2. onto the xz-plane is the region R bounded by this ellipse (centered at the origin with axes of length 4 and 412). Here are the observations that lead to the limits of integration with the ordering dy dz dx.

➤ Note that the problem is symmetric about z z the x- and z-axes. Therefore, the integral y x2 z2 8 z x2 over R could be evaluated over one- y 16 3x2 z2 8 2 quarter of R, R 51x, z2: 0 … z … 28 - 2x 2, … … 6 0 x 2 , R x x D y 22 in which case the final result must be multiplied by 4. z x2 Inner integral 8 2 with respect to y 8

2 2 x2z2 163x z 2 82x2 16 3 ͵͵ ͵ dy dA ͵ ͵ ͵ dy dz dx 2 2 R x z 2 82x2 x2z2 (a) (b) FIGURE 14.44 Figure 3.13: the pillar-base diagram for the triple integral in Example 3.9 Inner integral with respect to y: A line through the solid parallel to the y-axis enters the solid at y = x 2 + z 2 and exits at y = 16 - 3x 2 - z 2. Therefore, for fixed values of x and z, we integrate over the interval x 2 + z 2 … y … 16 - 3x 2 - z 2 (Figure 14.44a). Middle integral with respect to z: Now we must cover the region R. A line parallel to the z-axis enters R at z = - 28 - 2x 2 and exits R at z = 28 - 2x 2. Therefore, for a fixed value of x, we integrate over the interval - 28 - 2x 2 … z … 28 - 2x 2 (Figure 14.44b). Outer integral with respect to x: To cover all of R, x must run from x = -2 to x = 2 (Figure 14.44b). 1 2 = Integrating f x, y, z 1, the iterated integral for the volume is

2 28 -2x2 16-3x2 -z2 V = dy dz dx L-2 L-28 -2x2 Lx2 +z2

2 28 -2x2 = 116 - 4x 2 - 2z 22 dz dx Evaluate the inner integral and simplify. L-2L-28 -2x2

2 2 2z 3 28 -2x = a16z - 4x 2z - b` dx Evaluate the middle integral. L-2 3 -28 -2x2

2 1612 > = 14 - x 223 2 dx = 32p12. Evaluate the outer integral. 3 L-2

The last (outer) integral in this calculation requires the trigonometric substitution x = 2 sin u. Related Exercises 25–34 ➤ 1024 Chapter 14 • Multiple Integration

p>2 1 p>2 18. The prism in the first octant bounded by z = 2 - 4x and y = 8 11. sin px cos y sin 2z dy dx dz L0 L0 L0 2 2 1 12. yzex dx dz dy L0 L1 L0

13. 1xy + xz + yz2 dV; D = 51x, y, z2: -1 … x … 1, l D -2 … y … 2, -3 … z … 36 y - 2 - 2 66 14. xyze x y dV; D = 51x, y, z2: 0 … x … 1ln 2, Multiple Integrations l 19. The wedge above the xy-plane formed when the cylinder D + + 2 += 2 = = = - Suppose D 0be… they … tetrahedron1ln 4, 0 solid… z … in the16 first octant bounded by the plane 2x 3y x6z y 4 is cut by the planes z 0 and y z 12 (see Figure 3.14). Now you are asked to evaluate z 15–24. of solids Find the volume of the following solids 2 using triple integrals. dV. ˚D 12 − 3y − 6z 15. The region in the first octant bounded by the plane Which is the best order of integration? Why? 2x + 3y + 6z = 12 and the coordinate planes z y x

) 20. The region bounded by the parabolic cylinder y = x 2 and the planes z = 3 - y and z = 0 z

= 16. The regionFigure in 3.14:the first the octant tetrahedron formedD whenin the the above cylinder question. z sin y, for 0 … y … p, is sliced by the planes y = x and x = 0

y

21. The region between the x 2 + y 2 + z 2 = 19 and the hyperboloid z 2 - x 2 - y 2 = 1, for z 7 0 z

17. The region bounded below by the cone z = 2x 2 + y 2 and bounded above by the sphere x 2 + y 2 + z 2 = 8 z

(0 0 ͙8) 2 2 2 8 22. The region bounded by the surfaces z = ey and z = 1 over the rectangle 51x, y2: 0 … x … 1, 0 … y … ln 26 z y2

ln 2 y 1 x 1028 Chapter 14 • Multiple Integration

Table 14.4 (Continued)

Name Description Example 51 u 2 u = u 6 z Vertical half plane r, , z : 0

x y 0

Horizontal plane 51r, u, z2: z = a6 z

a

y z x

Cone 51r, u, z2: z = ar6, a ϶ 0 z

3.5 Triple Integrals in Cylindrical Coordinates 67 3.5 Triple Integrals in Cylindrical Coordinates y In order to set-up or computex a double integral ofy a circular region, it is often more convenient x to convert the problem into polar coordinates. For triple integrals, there are also some solids that are not so “compatible” with rectangular(a) coordinates but are easily to handle if one convert the problem into cylindrical or spherical coordinates. These solids include cylinders, cones, , etc. z EXAMPLE 1 Sets in cylindrical coordinates Identify and sketch the following sets 3.5.1 Cylindrical Coordinates 1 in cylindrical coordinates. Cylindrical coordinates in R3 is simply combining the polar coordinates for the (x, y)-directions, and keeping the z coordinates. The conversion rule is given by: a. Q = 51r, u, z2: 1 … r … 3, z Ú 06

x = r cos θ b. S = 51r, u, z2: z = 1 - r, 0 … r … 16 y = r sin θ 11 SOLUTION x z = z y (b) a. The set Q is a cylindrical shell with inner radius 1 and outer radius 3 that extends where r ≥ 0, 0 ≤ θ ≤ 2π and z can be any real number. Just like polar coordinates, it is good u to keep in mind that x2 +FIGUREy2 = r2 ,14.48 which will sometimes simplify your calculations. indefinitely Figure along the positive z-axis (Figure 14.48a). Because is unspecified, it 3.15 explains the geometry of these conversion rules. takes on all values. = 51 u 2 = 6 z b. To identify this solid, it helps to work in steps. The set S1 r, , z : z r is a cone that opens upward with its vertex at the origin. Similarly, the set = 51 u 2 = - 6 S2 r, , z : z r is a cone that opens downward with its vertex at the origin. P(r, , z) Therefore, S is S2 shifted vertically upward by one unit; it is a cone that opens down- ward with its vertex at (0, 0, 1). Because 0 … r … 1, the base of the cone is on the y z

tan xy-plane (Figure 14.48b). x ➤ r ͙x2 y2 Related Exercises 11–14 x r cos y Equations for transforming Cartesian coordinates to cylindrical coordinates, and vice x y r sin versa, are often needed for integration. We simply use the rules for polar coordinates (Sec- FIGURE 14.49 tion 11.2) with no change in the z-coordinate (Figure 14.49). Figure 3.15: cylindrical coordinates

If one sets r = constant, then it describes an infinite cylinder in R3 with z-axis as the central axis. Therefore, if the solid is cylindrical in shape, it is usually easier to set-up a triple integral using cylindrical coordinates. Analogous to polar coordinates, the volume element dV is given by: Theorem 3.3 Under cylindrical coordinates (r, θ, z), we have:

dV = rdzdrdθ.

Let’s look at an example:

2 2  Example 3.10 The volume of the solid bounded by two surfaces z = 4 − 4(x + y ) and z = (x2 + y2)2 − 1, as shown in Figure 3.16.

 Solution When using cylindrical coordinates, one often (though not always) set z as the pillar variable, then r and θ become the base variable. The z-pillar enters the solid at z = (x2 + y2)2 − 1, which is z = r4 − 1 in cylindrical coordinates, and leaves the solid at z = 4 − 4(x2 + y2), which is z = 4 − 4r2 in cylindrical coordinates. These determine the lower and upper limit of the inner integral. 68 Multiple Integrations

The shadow is a circle centered at the origin. To find its radius, we solve:

r4 − 1 = z = 4 − 4r2

which gives r = 1. Therefore, the outer and middle integrals have upper and lower limits given by: θ=2π r=1 . ˆθ=0 ˆr=0 Combining all of the above, the volume of the solid is given by:

θ=2π r=1 z=4−4r2 dV = 1 rdzdrdθ ˚D ˆθ=0 ˆr=0 ˆz=r4−1 2π 1 2 = [z]z=4−4r rdrdθ r4−1 ˆ0 ˆ0 2π 1 812 Chapter 14: Multiple Integrals = (4 − 4r2 − r4 + 1)r drdθ ˆ0 ˆ0 2π 1 Finding Iterated Integrals in Spherical Coordinates = volume (of5r D−as4r an3 − iteratedr5) drd tripleθ integral in (a) cylindrical and In Exercises 33–38, (a) find the spherical coordinate limits for the in- ˆ0(b) sphericalˆ0 coordinates. Then (c) find V. r=1 tegral that calculates the volume of the given solid and then (b) evalu- 41. Let2π D be5 rthe2 smaller rcap6  cut from a solid ball of radius 2 units by = − − 4 − ate the integral. a plane 1 unit fromr the centerd ofθ the sphere. Express the volume ˆ0 2 6 r=0 33. The solid between the sphere r = cos f and the hemisphere of D as an iterated triple integral in (a) spherical, (b) cylindrical, 2π  5 1  8π r = 2, z Ú 0 = and (c) rectangular− 1 − coordinates.dθ = . Then (d) find the volume by eval- ˆ0uating 2one of the6 three triple 3integrals. z 42. Express the moment of inertia Iz of the solid hemisphere r 5 2 2 2 r 5 cos f 2 2 x + y + z … 1, z Ú 0, as an iterated integral in (a) cylindri- cal and (b) spherical coordinates. Then (c) find Iz.

Volumes Find the volumes of the solids in Exercises 43–48. 43. 44. 2 2 z z x y z 4 4 (x2 y2) 34. The solid bounded below by the hemisphere r = 1, z Ú 0, and z 5 1 2 r above by the cardioid of revolution r = 1 + cos f

z 21 21 r 5 1 1 cos f 1 r 5 1 x 1 y

x y z (x2 y2)2 1 z 5 21 2 r2

45. 46. Figure 3.16: the solid in Examplez 3.10. z 2 2 z –y z x y x y 35. The solid enclosed by the cardioid of revolution r = 1 - cos f 36. The upper portion cut from the solid in Exercise 35 by the r 3 cos y xy-plane  Example 3.11 Let D be a solid cylinder of radius a with z-axis as the central axis, is bounded 37. The solid bounded below by the sphere r = 2 cos f and above by x r –3 cos by planes z = z and z = z + h, where z and h are constants. Therefore,x the cylinder has the cone z = 2x2 + y 2 0 0 0 y height h. Suppose the solid has uniform density δ. Evaluate the following triple integral: 47. 48. z z 5 x2 1 y2 z 2 2 z Iz := δ(x + y )dV ˚ 2 2 z D 1 x y z 5 31 2 x2 2 y2 which is the moment of inertia of the solid about the z-axis. r 5 2 cos f

x y x 38. The solid bounded below by the xy-plane, on the sides by the y sphere r = 2, and above by the cone f = p>3 r sin x y r 5 cos u z p f 5 49. Sphere and cones Find the volume of the portion of the solid 3 sphere r … a that lies between the cones f = p>3 and f = 2p>3. r 5 2 50. Sphere and half-planes Find the volume of the region cut from the solid sphere r … a by the half-planes u = 0 and u = p>6 in x y the first octant. 51. Sphere and plane Find the volume of the smaller region cut from the solid sphere r … 2 by the plane z = 1. Finding Triple Integrals 52. Cone and planes Find the volume of the solid enclosed by the 39. Set up triple integrals for the volume of the sphere r = 2 in cone z = 2x2 + y 2 between the planes z = 1 and z = 2. (a) spherical, (b) cylindrical, and (c) rectangular coordinates. 53. Cylinder and paraboloid Find the volume of the region 40. Let D be the region in the first octant that is bounded below by bounded below by the plane z = 0, laterally by the cylinder the cone f = p>4 and above by the sphere r = 3. Express the x2 + y 2 = 1, and above by the paraboloid z = x2 + y 2. 3.5 Triple Integrals in Cylindrical Coordinates 69

 Solution We choose the order of integration dzdrdθ. The z-pillar enters the solid at z = z0 and leaves at z = z0 + h. The shadow is the circle with radius a centered at the origin. Therefore,

2π a z0+h I = δ(x2 + y2) · rdzdrdθ z ˆ ˆ ˆ 0 0 z0 | {z } r2 2π a z0+h = δr3dzdrdθ ˆ ˆ ˆ 0 0 z0 2π a = δr3hdrdθ ˆ0 ˆ0 r=a 2π  δhr4  = dθ ˆ0 4 r=0 2π δha4 = dθ ˆ0 4 δhπa4 = . 2 Most physics/engineering textbook expresses the moment of inertia in terms of the total mass m rather than the density δ. To rewrite the above answer in terms of m, we note that: m δ = V

where V is the total volume of the solid, which is πa2h. Therefore, combining with the above calculation, we have: m hπa4 ma2 Iz = · = , πa2h 2 2 which is exactly what you can find in physics or engineering textbooks. 70 Multiple Integrations 3.6 Triple Integrals in Spherical Coordinates Another common in physics and engineering is the spherical coordinates. A point in R3 can be represented by three numbers (ρ, θ, ϕ) which have the following geometric meaning:

ρ = distance from the point to the origin θ = the projected angle on the xy-plane counting from the positive x-axis φ = the angle counting from the positive z-axis 806 Chapter 14: Multiple Integrals

z Therefore, Mxy 32p 1 4 P r f u z = = = , ( , , ) M 3 8p 3

f and the centroid is (0, 0, 4 > 3). Notice that the centroid lies outside the solid. r z 5 r cos f 0 Spherical Coordinates and Integration x u Spherical coordinates locate points in space with two angles and one distance, as shown r 1 y in Figure 14.47. The first coordinate, r = ƒ OP ƒ , is the point’s distance from the origin. 1 y r f x Unlike r, the variable is never negative. The second coordinate, , is the angle OP makes with the positive z-axis. It is required to lie in the interval [0, p]. The third coordinate is the angle u as measured in cylindrical coordinates. FIGUREFigure 14.47 3.17: sphericalThe spherical coordinates coordinates r, f, and u and their relation to x, y, z, and r.

The ranges of (ρ, θ, ϕ) are: DEFINITION Spherical coordinates represent a point P in space by ordered triples sr, f, ud in which ρ ≥ 0, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π. 1. r is the distance from P to the origin. From standard trigonometry, one can figure out the following conversion rules: 1 2. f is the angle OP makes with the positive z-axis s0 … f … pd. x = ρ sin ϕ cos θ 3. u is the angle from cylindrical coordinates s0 … u … 2pd. y = ρ sin ϕ sin θ

z = zρ cos ϕ On maps of the Earth, u is related to the meridian of a point on the Earth and f to its f f r 5 0, latitude, while is related to elevation above the Earth’s surface. Similar to polar and cylindricalr coordinates,u it isf good to keep in mind that and vary 0 The equation r = a describes the sphere of radius a centered at the origin (Figure 14.48). f u P(a, 0, 0) The equation f = f describes a single cone whose vertex lies at the origin and whose 2 = 2 + 2 + 2 0 ρ x y z . axis lies along the z-axis. (We broaden our interpretation to include the xy-plane as the cone f = p>2. ) If f is greater than p>2, the cone f = f opens downward. The The equation ρ = ρ , where ρ is a positive constant, represents the sphere of radius0 ρ 0 0 0 equation u = u describes the0 half-plane that contains the z-axis and makes an angle u centered at the origin. The equation ϕ = ϕ represents a cone with angle ϕ with the0 origin as 0 0 with the0 positive x-axis. its vertex. Therefore, the spherical coordinates are particular useful when handling spherical or conical objects (see Figure 3.18). To integrate using spherical coordinates, one should note that: Equations Relating Spherical Coordinates to Cartesian u Theorem 3.4 Under the spherical coordinates0 (ρ, θ, ϕ), the volumey elementand Cylindrical is given by: Coordinates r = r sin f, x = r cos u = r sin f cos u, x dV = ρ2 sin ϕ dρdϕdθ. z = r cos f, y = r sin u = r sin f sin u, (1) r 5 a, 2 Infinitesimally, ρ sin ϕ dρdϕdfθ andis theu vary volume of the little cube in red in Figure 3.19. The 2 2 2 2 2 u u r = 2x + y + z = 2r + z . volume of this little piece decreases as it approaches5 0, toward the north/south pole. It accounts r and f vary π for the factor sin ϕ, which is smaller when ϕ is close to 0 or π, but large when ϕ is close to 2 . Although a more precise explanation is available, we omit it in this note because this ρ2 sin ϕ FIGURE 14.48 Constant-coordinate EXAMPLE 3 Find a spherical coordinate equation for the sphere x2 + y 2 + sz - 1d2 = 1. Jacobian can be derived systematically usingequations in spherical coordinates. Meanwhile yield let’s look at some examples on how to use the spherical coordinatesspheres, single to do cones, integration. and half-planes. Solution We use Equations (1) to substitute for x, y, and z: x2 + y 2 + sz - 1d2 = 1

r2 sin2 f cos2 u + r2 sin2 f sin2 u + sr cos f - 1d2 = 1 Eqs. (1) 2 2 s 2 + 2 d + 2 2 - + = r sin f(''')'''*cos u sin u r cos f 2r cos f 1 1 1 2s 2 + 2 d = r (''')'''*sin f cos f 2r cos f 1 r2 = 2r cos f

r = 2 cos f . r 7 0 806 Chapter 14: Multiple Integrals 14.7 Triple Integrals in Cylindrical and Spherical Coordinates 807

z Therefore, z The angle f varies from 0 at the north pole of the sphere to p>2 at the south pole; the Mxy 32p 1 4 2 P r 2f u 2 angle u does not appearz in= the expression= = for, r, reflecting the symmetry about the z-axis x 1( y, 1, )(z 2 1) 5 1 M 3 8p 3 2 r 5 2 cos f (see Figure 14.49). f and the centroid is (0, 0, 4 > 3). Notice that the centroid lies outside the solid. r = 2 2 + 2 z 5 r cos f EXAMPLE 4 Find a spherical coordinate equation for the cone z x y . 0 Spherical Coordinates and Integration x Solution 1 Use geometry. The cone is symmetric with respect to the z-axis and cuts the 1 u Spherical coordinates locate points in space with two angles and one distance, as shown r first quadrant of the yz-plane along1 the line z = y. The angle between the cone and the f y in Figure 14.47. The first coordinate, r = ƒ OP ƒ , is the point’s distance from the origin. r positive z-axis is therefore p>4 . The cone consists of the1 points whose spherical y Unlike r, the variabler is never negative. The second coordinate, f, is the angle OP makes x > = > withcoordinates the positive zhave-axis. fIt is equal required to p to 4,lie soin itsthe equationinterval [0, is p f]. The pthird4. coordinate (See Figure is 14.50.) the angle u as measured in cylindrical coordinates. FIGURE 14.47 The spherical coordinates Solution 2 Use algebra. If we use Equations (1) to substitute for x, y, and z we obtain the r, f, and u and their relation to x, y, z, and r. y same result: x DEFINITION Spherical coordinates represent z = a point2x 2P +in spacey 2 by ordered triples sr, f, ud in which FIGURE 14.49 The sphere in Example 3. = 2 2 2 1. r is the distance from P to the origin. r cos f r sin f Example 3 1 2. f is the angle OP makes with the positive r cos zf-axis= sr0 sin… f f… pd. r 7 0, sin f Ú 0 z 3. u is the angle from cylindrical coordinates s0 … u … 2pd. 3.6 Triple Integrals in Spherical Coordinates 71 cos f = sin f 4 p f = . 0 … f … p z On maps of the Earth, u is related to the meridian of 4a point on the Earth and f to its f f r 5 0, latitude, while is related to elevation above the Earth’s surface. r u f and vary 0 SphericalThe equation coordinates r = a describes are the useful sphere for of radiusdescribing a centered spheres at the origin centered (Figure at 14.48). the origin, half-planes f u P(a, 0, 0) f = f z x2 y2 Thehinged equation along the0 describes z-axis, anda single cones cone whose whose vertices vertex lies lie at at the the origin origin and and whose whose axes lie along axisthe lies z along-axis. the Surfaces z-axis. (Welike broaden these have our interpretation equations of to constant include the coordinate xy-plane as value: the = > > = 4 cone f p 2. ) If f0 is greater than p 2, the cone f f0 opens downward. The equation u = u describes the half-plane that contains the z-axis and makes an angle u 0 = Sphere, radius 4, center 0at origin with the positive x-axis. r 4 p f = Cone opening up from the origin, making an 3 angle ofp>3 radians with the positive z-axis x y Equations Relating Spherical Coordinates to Cartesian u p 0 y and Cylindrical Coordinates u = . Half-plane, hinged along the z-axis, making an 3 angle ofp>3 radians with the positive x-axis FIGURE 14.50 The cone in Example 4. r = r sin f, x = r cos u = r sin f cos u, x When computing z = r cos triple f, integralsy = r sin over u = ar regionsin f sin D u,in spherical coordinates,(1) we partition r 5 a, f and u vary the region into n spherical= 2 2wedges.+ 2 + The2 = size2 2 of+ the2 kth spherical wedge, which contains a u 5 u r x y z r z . Volume Differential 0in, Spherical point sr , f , u d, is given by the changes ¢r , ¢f , and ¢u in r, f, and u. Such a spher- r and f vary k k k k k k Coordinates ical wedge has one edge a circular arc of length rk ¢fk, another edge a circular arc of 2 2 2 2 FIGUREdV 14.48= r Constant-coordinate sin f dr df du EXAMPLElength 3 rk sinFind f ak spherical¢uk, and coordinate thickness equation ¢rk for. The the spherespherical x + wedgey + sz closely- 1d = approximates1. a cube Figureequations 3.18: in spherical coordinate coordinates planes yield of these dimensions when ¢rk, ¢fk, and ¢uk are all small (Figure 14.51). It can be shown spheres, single cones, and half-planes. Solution We use Equations (1) to substitute for x, y, and z: 2 that the volume of this spherical wedge ¢Vk is ¢Vk = rk sin fk ¢rk ¢fk ¢uk for 2 + 2 + s - d2 = z srk, fk, ukd a point chosen inside x they wedge.z 1 1 r sin f rThe2 sin corresponding2 f cos2 u + r2 sin Riemann2 f sin2 u +sumsr cosfor f a -function1d2 = 1 ƒsr, f, ud Eqs. is (1) r f Δu rΔf sin r2 sin2 fscos2 u + sin2 ud + r2 cosn 2 f - 2r cos f + 1 = 1 (''')'''* = s d 2 ¢ ¢ ¢ Sn a ƒ rk, fk, uk rk sin fk rk fk uk . 1 k=1 r2ssin2 f + cos2 fd = 2r cos f As the norm of a partition approaches(''')'''* zero, and the spherical wedges get smaller, the Riemann sums have a limit when ƒ is continuous:1 r2 = 2r cos f f O = s d = r = 2s cos f . d 2r 7 0 r lim Sn ƒ r, f, u dV ƒ r, f, u r sin f dr df du. n: q 9 9 y D D Δr In spherical coordinates, we have u 2 u 1 Δu dV = r sin f dr df du. x To evaluate integrals in spherical coordinates, we usually integrate first with respect to r. FIGURE 14.51 In spherical coordinates2 Figure 3.19: geometric meaning# # of dV = ρ sin ϕ Thedρd ϕproceduredθ. for finding the limits of integration is as follows. We restrict our attention dV = dr r df r sin f du to integrating over domains that are solids of revolution about the z-axis (or portions = r2 sin f dr df du. thereof) and for which the limits for u and f are constant.

 Example 3.12 Consider a solid sphere S of radius r centered at the origin. Suppose it has uniform density δ. Derive the moment of inertia about z-axis, which is given by:

2 2 Iz := δ(x + y )dV ˚S 72 Multiple Integrations

 Solution The solid sphere S can be written as 0 ≤ ρ ≤ r in spherical coordinates. As a full sphere, the ranges for the angles are:

0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.

Therefore, the set-up of the integral is:

θ=2π ϕ=π ρ=r 2 2 2 Iz = δ(x + y )ρ sin ϕ dρdϕdθ. ˆθ=0 ˆϕ=0 ˆρ=0

To compute the integral, we need to express x2 + y2 using spherical coordinates:

x2 + y2 = (ρ sin ϕ cos θ)2 + (ρ sin ϕ sin θ)2 = ρ2 sin2 ϕ.

Therefore,

2π π r 2 2 2 Iz = δ · ρ sin ϕ · ρ sin ϕ dρdϕdθ ˆ0 ˆ0 ˆ0 2π π r = δρ4 sin3 ϕ dρdϕdθ ˆ0 ˆ0 ˆ0 2π !  π   r  = dθ sin3 ϕ dϕ δρ4 dρ ˆ0 ˆ0 ˆ0  π  δr5 = 2π · (1 − cos2 ϕ) sin ϕ dϕ · ˆ0 5 π  cos3 ϕ  δr5 = 2π · − cos ϕ + · 3 0 5  1 1  δr5 = 2π · 1 − + 1 − · . 3 3 5 8πδr5 = . 15 Although it is a perfectly acceptable answer, the moment of inertia in many physics and engineering books is expressed in terms of the total mass m rather than of the density. Since the density in this problem is uniform, one can easily derive the moment of inertia formula in terms of m: 8πr5 m 2mr2 Iz = · = 15 4πr3 5 3 which is what one can find in physics or engineering books.

Conical objects are another type of solids which are compatible with spherical coordinates. π The inequality 0 ≤ ϕ ≤ 3 represents an infinite cone above the xy-plane with cone angle π 3 counting from the positive z-axis. If one further impose the inequality 0 ≤ ρ ≤ 1 which represents the sphere with radius 1 centered at the origin, then the combined inequalities:

π 0 ≤ ρ ≤ 1, 0 ≤ ϕ ≤ 3 represents the common part of the sphere and the cone, which is in an ice-cream shape. Let’s consider the following example:

 Example 3.13 Find the volume of the “ice-cream cone” D cut from the solid sphere ρ ≤ 1 π by the cone ϕ = 3 , as shown in Figure 3.20. 3.6 Triple Integrals in Spherical Coordinates 73

2  Solution The volume is given by ρ sin ϕ dρdϕdθ. The limits of the inner-most integral ˚D are determine by where the ρ-ray, labeled M in the diagram, enters the solid and where it leaves the solid. Evidently, it enters from the origin ρ = 0. When it leaves, it always leaves from the sphere part and never the cone part, so the upper limit should be ρ = 1. π The ϕ-angle run from 0 to 3 , and the θ-ray, labeled L in the diagram, sweeps over the shadow from 0 to 2π. Combining all these limits, the volume is given by the integral:

θ=2π ϕ=π/3 ρ=1 2π ! π/3 ! 1 ! ρ2 sin ϕ dρdϕdθ = dθ sin ϕ dϕ ρ2 dρ 14.7 Triple Integrals in Cylindrical and Spherical Coordinates 809 ˆθ=0 ˆϕ=0 ˆρ=0 ˆ0 ˆ0 ˆ0 1 = 2π · [− cos ϕ]π/3 · 4. Find theu-limits of integration. The ray L sweeps over R as u runs from a to b. These 0 3 are the u-limits of integration. The integral is  1  1 π = 2π · − + 1 · = . u= b f=fmax r=g2sf, ud 2 3 3 ƒsr, f, ud dV = ƒsr, f, ud r2 sin f dr df du. 9 Lu=a Lf=fmin Lr=g1sf, ud D

z EXAMPLE 5 Find the volume of the “ice cream cone” D cut from the solid sphere r … 1 by the cone f = p>3. M D Sphere r 5 1 Solution The volume is V = r2 sin f dr df du, the integral of ƒsr, f, ud = 1 7D over D. To find the limits of integration for evaluating the integral, we begin by sketching D p Cone f 5 and its projection R on the xy-plane (Figure 14.52). 3 Ther-limits of integration. We draw a ray M from the origin through D making an an- R gle f with the positive z-axis. We also draw L, the projection of M on the xy-plane, along u with the angle u that L makes with the positive x-axis. Ray M enters D at r = 0 and leaves at r = 1. x L y Thef-limits of integration. The cone f = p>3 makes an angle of p>3 with the posi- tive z-axis. For any given u, the angle f can run from f = 0 to f = p>3. FIGURE 14.52 The ice cream cone in Figure 3.20: ice cream cone Theu-limits of integration. The ray L sweeps over R as u runs from 0 to 2p. The Example 5. volume is 2p p>3 1 V = r2 sin f dr df du = r2 sin f dr df du 9 L0 L0 L0 D p p> 3 p p> 2 3 r 1 2 3 1 = c d sin f df du = sin f df du L0 L0 3 0 L0 L0 3 p p> p 2 1 3 2 1 1 1 p = c- cos f d du = a- + b du = s2pd = . L0 3 0 L0 6 3 6 3

EXAMPLE 6 A solid of constant density d = 1 occupies the region D in Example 5. Find the solid’s moment of inertia about the z-axis.

Solution In rectangular coordinates, the moment is

2 2 Iz = sx + y d dV. 9

In spherical coordinates, x2 + y 2 = sr sin f cos ud2 + sr sin f sin ud2 = r2 sin2 f. Hence,

2 2 2 4 3 Iz = sr sin fd r sin f dr df du = r sin f dr df du. 9 9 For the region in Example 5, this becomes 2p p>3 1 2p p>3 r5 1 4 3 3 Iz = r sin f dr df du = c d sin f df du L0 L0 L0 L0 L0 5 0

p p> p 3 p> 1 2 3 1 2 cos f 3 = s1 - cos2 fd sin f df du = c-cos f + d du 5L0 L0 5L0 3 0

p p 1 2 1 1 1 1 2 5 1 p = a- + 1 + - b du = du = s2pd = . 5L0 2 24 3 5L0 24 24 12

4 — Vector Calculus

4.1 Vector Fields on R2 and R3 4.1.1 Examples of Vector Fields Vector Calculus is an important tool in physics and engineering. Many concepts in physics, such as gravitational and electrostatic , fluid flow, heat flow, etc. are described using a mathematical concept called vector fields. Gravitational, magnetic and electric forces are not just one single vector. Both directions and magnitudes may vary from place to place and even changing over time. A vector field consists of a collection of vectors that are denoted by a vector-valued function F(x, y, z) : R3 → R3, which assigns to each point (x, y, z) a vector labeled F(x, y, z). For instance, the gravitational field due to the Sun (whose center is at the origin) is given by: xi + yj + zk F(x, y, z) = −GMm , (x2 + y2 + z2)3/2 where G, M and m are constants. That is to say, at the point (1, 0, 0), the gravitational force is   −GMm i, whereas at the point (1, 0, 1), the gravitation force is −GMm 1 i + 1 k . 23/2 23/2 With the help of computer software (such as Mathematica), one can visualize a vector field easily. A plot of the above gravitational field can be found in Figure 4.1.

Figure 4.1: plot of the gravitational force F, the central sphere is the Sun (not a part of the vector field) i = cos(θ)er sin(θ)eθ j = sin(θ)er + cos(θ)eθ. −

In situations with circular symmetry, it is often more natural to describe vector fields in terms of er and eθ rather than i and j. One can translate between the two descriptions as follows:

er = cos(θ)i + sin(θ)j eθ = sin(θ)i + cos(θ)j −

1072 Chapter 15 • Vector Calculus

Shear vector y EXAMPLE 1 Vector fields Sketch representative vectors of the following vector ϭ ͗ ͘ F 0, x fields. 1 2 = 8 9 = a. F x, y 0, x x j (a shear field) 1 2 = 8 - 2 9 = 1 - 22 ͉ ͉ … b. F x, y 1 y , 0 1 y i, for y 1 (channel flow) 1 2 = 8- 9 = - + c. F x, y y, x y i x j (a field) SOLUTION

1 a. This is independent of y. Furthermore, because the x-component of F is 1 zero, all vectors in the field (for x ϶ 0) point in the y-direction: upward for x 7 0 x 76 and downward for x 6 0. The magnitudes of the vectors in the field increase withVector dis- Calculus tance from the y-axis (Figure 15.4). The flow curves for this field are vertical lines. If The generalF represents form of a a velocity vector field, field a inparticleR3 isright given of the by: y-axis moves upward, a particle left of the y-axis moves downward, and a particle on the y-axis is stationary.

b. In this case,F the(x ,vectory, z) field = F xis( independentx, y, z)i + Fofy (xx and, y, thez)j y+-componentFz(x, y, z of)k F is zero. Because 1 - y 2 7 0 for ͉y͉ 6 1, vectors in this region point in the positive where each ofx-direction.Fx, Fy and TheF zx-componentis a scalar-valued of the vector function. field is zero at the boundaries y = {1 and increases to 1 along the center of the strip, y = 0. The vector field might model FIGURE 15.4 ∂F i Do NOTthe flow confuse of waterFx inwith a straight thepartial shallow derivativechannel (Figure∂x ! 15.5 Although); its flow we curves used are the hori- subscript notationzontalFx tolines, denote indicating a partial motion derivative in the direction before, of wethe shouldpositive avoidx-axis. using it in this chapter. Here Fx means the x-component of the vector field F. ➤ Drawing vectors with their actual length c. It often helps to determine the vector field along the coordinate axes. often leads to cluttered pictures of vector = 1 2 = 8 9 7 fields. For this reason, most of the vectorMany vector• When fields y in0 (along physics the x-axis), are three-dimensional. we have F x, 0 0, x To. With start x with,0, this we vector will also study 2 fields in this chapter are illustratedtwo-dimensional with field vector consists fields. of vectors A pointing two-dimensional upward, increasing vector in fieldlength isas ax collectionincreases. With of vectors in R 6 ͉ ͉ proportional scaling: All vectorsthat are are denotedx by0, athe vector-valued vectors point downward, function increasingF : R2 → in lengthR2, which as x increases. assigns to each point (x, y) a multiplied by a scalar chosen to make = 1 2 = 8- 9 7 vector F(x, y)•. When The generalx 0 (along form the of y-axis), a two-dimensional we have F 0, y vectory, 0 field. If y is given0, the vectors by: the vector field as understandable as point in the negative x-direction, increasing in length as y increases. If y 6 0, the possible. ͉ ͉ vectors point in the positiveF(x, yx)-direction, = Fx(x, increasingy)i + Fy (inx ,lengthy)j. as y increases. ➤ A useful observation for two-dimensional A few more representative vectors show that the vector field has a counterclockwise ( ) = vector fields F = 8 f, g9 is that the Figureslope 4.2rotation shows about the the plot origin; of the two magnitudes examples of the of vectors two-dimensional increase with distance vector from fields: the F x, y 1 2 1 2> 1 2 2 of the vector at x, y is g x, y(1f −x, yy. )i andoriginF(x, y(Figure) = − 15.6yi +). xj. In Example 1a, the are everywhere undefined; in part (b), the slopes are y everywhere 0, and in part (c), the slopes Rotation vector field are -x>y. F ϭ ͗Ϫy, x͘

Channel flow y 2 F ϭ ͗1 Ϫ y2, 0͘ 1

Ϫ22 x

Ϫ2 Ϫ11x

Ϫ1

FIGURE 15.5 FIGURE 15.6

Figure 4.2: two examples of vector fieldsRelated in R2 Exercises 6–16 ➤

QUICK CHECK 1 If the vector field in Example 1c describes the velocity of a fluid and you place a small cork in the plane at 12, 02, what path will it follow? ➤ 4.1.2 Vector Fields in Polar Coordinates Sometimes it is more convenient to use other coordinate systems to represent a vector field, especially those which are radially symmetric (such as gravitation and electrostatic forces). Think of i and j are the unit vectors in the increasing x and y directions in R2 respectively. Two dimensions One can also define a pair of basis vectors er and eθ for polar coordinates, which are the unit vectors in the increasing r and θ directions respectively. See Figure 4.3. At any point in the plane, we can define vectors rr and eθ as shown:

j

i eθ er

Figure 4.3: basis vectors er and eθ in polar coordinates

This is worthwhile to note that unlike i and j in rectangular coordinates, the pair er and eθ depends on the base point! To express er and eθ in terms of i and j, one can do the following: 4.1 Vector Fields on R2 and R3 77

Take eθ as an example. It is the unit tangent vector along the θ-coordinate curve (with r fixed). Such a curve can be parametrized by:

r(θ) = xi + yj = (r cos θ)i + (r sin θ)j where r is regarded to be a constant. The tangent vector is given by:

r0(θ) = (−r sin θ)i + (r cos θ)j.

Therefore, the unit tangent is given by:

r0(θ) e = = (− sin θ)i + (cos θ)j. θ |r0(θ)|

In essence, one can derive eθ by: 1. differentiate the position vector xi + yj with respect to θ; 2. then divide the resulting vector by its magnitude to get a unit vector. Let’s perform these procedures to derive the unit vector er: ∂  ∂(r cos θ)   ∂(r sin θ)  (xi + yj) = i + j ∂r ∂r ∂r = (cos θ)i + (sin θ)j.

Since it is a unit vector already, we conclude:

er = (cos θ)i + (sin θ)j.

It is worthwhile to note that er and eθ are orthogonal to each other, since er · eθ = 0. A general vector field F(r, θ) in R2 in polar coordinates is given by:

F(r, θ) = Fr(r, θ)er + Fθ(r, θ)eθ where Fr and Fθ are scalar-valued functions, which represent the radial and rotational compo- nents of F respectively. It is sometimes more convenient to use polar coordinates to express a radial or a rotational vector field. For instance, if Fθ = 0, then F is a radial vector field, i.e.

F = Fr(r, θ)er (radial)

On the other hand, if Fr = 0 then F is a rotational vector field, i.e.

F = Fθ(r, θ)eθ (rotational)

A general vector field F = Frer + Fθeθ can be rewritten in terms of i and j using the relations er = (cos θ)i + (sin θ)j and eθ = (− sin θ)i + (cos θ)j:

F = Fr ((cos θ)i + (sin θ)j) + Fθ ((− sin θ)i + (cos θ)j) = (Fr cos θ − Fθ sin θ)i + (Fr sin θ + Fθ cos θ)j.

Since we also have F = Fxi + Fyj, the conversion rule of vectors fields from polar to rectangular coordinates can be stated as:

Frx − Fθy Fx = Fr cos θ − Fθ sin θ = p , x2 + y2

Fry + Fθ x Fy = Fr sin θ + Fθ cos θ = p . x2 + y2

As an example, for a vector field F = reθ, we have Fr = 0 and Fθ = r. Therefore, from the above conversion rules, we have Fx = −y and Fx = x. In rectangular coordinates, this vector field is expressed as F = −yi + xj. 78 Vector Calculus

We next derive some conversion rules of vector field from rectangular to polar coordinates. First we need to write i and j in terms of er and eθ. While it can be worked out by solving the equations er = (cos θ)i + (sin θ)j and eθ = (− sin θ)i + (cos θ)j for i and j, we adopt a more elegant approach. Let i = Aer + Beθ where A and B are to be determined. Observing that er and eθ are orthogonal and they are unit vectors, we consider:

i · er = (Aer + Beθ) · er = Aer · er + Ber · eθ = A + 0.

Therefore, A = i · er. Similarly, one can work out that B = i · eθ. Hence, we have:

i = (i · er)er + (i · eθ)eθ = (cos θ)er − (sin θ)eθ.

Analogously, one can also derive:

j = (j · er)er + (j · eθ)eθ = (sin θ)er + (cos θ)eθ.

With these conversion rules for basis vectors, one can derive conversion rules for components of a vector field:

Fxi + Fyj = Frer + Fθeθ

Fx ((cos θ)er − (sin θ)eθ) + Fy ((sin θ)er + (cos θ)eθ) = Frer + Fθeθ

(Fx cos θ + Fy sin θ)er + (−Fx sin θ + Fy cos θ)eθ = Frer + Fθeθ.

Equating the components, we have:

Fr = Fx cos θ + Fy sin θ

Fθ = −Fx sin θ + Fy cos θ

As an example, given a vector field F = xi + yj in rectangular coordinates, then Fx = x and Fy = y. Therefore,

2 2 Fr = x cos θ + y sin θ = r cos θ + r sin θ = r

Fθ = −x sin θ + y cos θ = −r cos θ sin θ + r sin θ cos θ = 0

In polar coordinates, the vector field can be expressed as F = rer, which is a radial vector field. To summarize all conversion rules derived so far, we have: Polar and Rectangular Conversions Conversion rules of coordinates: q x = r cos θ r = x2 + y2 y y = r sin θ θ = tan−1 x Conversion rules of basis vectors:

i = (cos θ)er − (sin θ)eθ er = (cos θ)i + (sin θ)j j = (sin θ)er + (cos θ)eθ eθ = (− sin θ)i + (cos θ)j 4.1 Vector Fields on R2 and R3 79

Conversion rules of components of vector fields:

Frx Fθy Fx = p − p Fr = Fx cos θ + Fy sin θ x2 + y2 x2 + y2

Fry Fθ x Fy = p + p Fθ = −Fx sin θ + Fy cos θ x2 + y2 x2 + y2

Generally, we do not usually express a vector field in polar coordinates unless it is radial or 2 2 rotational. Take F = (1 − y )i as an example, which we have Fx = 1 − y and Fy = 0. In polar coordinates, the components are given by:

2 2 2 Fr = (1 − y ) cos θ = (1 − r sin θ) cos θ 2 2 2 Fθ = −(1 − y ) cos θ = −(1 − r sin θ) cos θ

2 2 2 2 Therefore, F = {(1 − r sin θ) cos θ}er − {(1 − r sin θ) cos θ}eθ, which is way more compli- cated than it was in rectangular coordinates.

4.1.3 Vector Fields in Cylindrical Coordinates In R3, the cylindrical coordinates are essentially “polar” on the xy-plane, together with the vertical z variable. The basis vectors are denoted as er, eθ and k, which are unit vectors in the direction of increasing r, θ and z respectively. Note that these basis vectors are mutually perpendicular, meaning: er · eθ = 0, er · k = 0, eθ · k = 0.

Figure 4.4: basis unit vectors in cylindrical coordinates

The general form of a vector field in R3 in cylindrical coordinates is:

F(r, θ, z) = Fr(r, θ, z)er + Fθ(r, θ, z)eθ + Fz(r, θ, z)k

3 where Fr, Fθ and Fz are scalar-valued functions on R . The conversion rules for cylindrical coordinates are inherited from those for polar coordi- nates in R2, and the derivation of these conversion rules are almost the same. We omit the derivations here. Cylindrical and Rectangular Conversions Conversion rules of coordinates: q x = r cos θ r = x2 + y2 y y = r sin θ θ = tan−1 x z = z z = z 80 Vector Calculus

Conversion rules of basis vectors:

i = (cos θ)er − (sin θ)eθ er = (cos θ)i + (sin θ)j j = (sin θ)er + (cos θ)eθ eθ = (− sin θ)i + (cos θ)j k = k k = k

Conversion rules of components of vector fields:

Frx Fθy Fx = p − p Fr = Fx cos θ + Fy sin θ x2 + y2 x2 + y2

Fry Fθ x Fy = p + p Fθ = −Fx sin θ + Fy cos θ x2 + y2 x2 + y2

Fz = Fz Fz = Fz

4.1.4 Vector Fields in Spherical Coordinates 3 In spherical coordinates (ρ, θ, ϕ) in R , we label the basis vectors by eρ, eθ and eϕ. Let’s demonstrate how to write eρ in terms of i, j and k explicitly, and leave the other two as exercises. To find eρ, just like in polar coordinates, we first differentiate the position vector r = xi + yj + zk with respect to ρ: Page 3

r = xi + yj + zk = (ρ sin ϕ cos θ)i + (ρ sin ϕ sin θ)j + (ρ cos ϕ)k ∂r = (sin ϕ cos θ)i + (sin ϕ sin θ)j + (cos ϕ)k ∂ρ

∂r Incidentally, ∂ρ is unit, and so:

eρ = (sin ϕ cos θ)i + (sin ϕ sin θ)j + (cos ϕ)k.

Similarly, differentiate r with respect to θ or ϕ, and divide the resulting vector by its length to get a unit vector, one can get the expression of eθ and eϕ as:

eθ = −(sin θ)i + (cos θ)j eϕ = (cos ϕ cos θ)i + (cos ϕ sin θ)j − (sin ϕ)k

Note that also eρ, eθ and eϕ are mutually perpendicular:

eρ · eθ = 0, eρ · eϕ = 0, eθ · eϕ = 0.

Figure 4.5: basis unit vectors in spherical coordinates 4.1 Vector Fields on R2 and R3 81

The general form of a vector field in spherical coordinates is given by:

F(ρ, θ, ϕ) = Fρ(ρ, θ, ϕ)eρ + Fθ(ρ, θ, ϕ)eθ + Fϕ(ρ, θ, ϕ)eϕ.

A vector field of the form F = Fρeρ, i.e. Fθ = Fϕ = 0 at any point, is called a radial vector field. Using spherical coordinates, the gravitational force can be conveniently written as: GMm F = − e , ρ2 ρ which is much more elegant and concise then it were using rectangular coordinates. Since eρ, eθ and eϕ are orthonormal (i.e. mutually perpendicular and unit), one can mimic what we did for polar coordinates to derive conversion rules between spherical and rectangular coordinates. For instance, the vector i can be expressed in terms of eρ, eθ and eϕ by considering:

i = (i · eρ)eρ + (i · eθ)eθ + (i · eϕ)eϕ, and so it only amounts to computing the dot products between i and each of eρ, eθ and eϕ. j and k can be written in terms of eρ, eθ and eϕ in a similar way. After establishing the conversion rules between the basis vectors {i, j, k} and {eρ, eθ, eϕ}, one can derive the conversion rules between the components Fx, Fy, Fy and Fr, Fθ, Fϕ (see the derivation for the polar-rectangular conversion rules). We skip all the detail of derivation here since it is tedious and not enlightening. Spherical and Rectangular Conversions Conversion rules of coordinates: q x = ρ sin ϕ cos θ ρ = x2 + y2 + z2 y y = ρ sin ϕ sin θ θ = tan−1 x p x2 + y2 z = ρ cos ϕ ϕ = tan−1 z Conversion rules of basis vectors:

i = (sin ϕ cos θ)eρ + (cos ϕ cos θ)eϕ − (sin θ)eθ

j = (sin ϕ cos θ)eρ + (cos ϕ sin θ)eϕ + (cos θ)eθ

k = (cos ϕ)eρ − (sin ϕ)eϕ

eρ = (sin ϕ cos θ)i + (sin ϕ sin θ)j + (cos ϕ)k

eθ = −(sin θ)i + (cos θ)j eϕ = (cos ϕ cos θ)i + (cos ϕ sin θ)j − (sin ϕ)k

Conversion rules of components of vector fields:

Fx = (sin ϕ cos θ)Fρ + (cos ϕ cos θ)Fϕ − (sin θ)Fθ

Fy = (sin ϕ sin θ)Fρ + (cos ϕ sin θ)Fϕ + (cos θ)Fθ

Fz = (cos ϕ)Fρ − (sin ϕ)Fϕ

Fρ = (sin ϕ cos θ)Fx + (sin ϕ sin θ)Fy + (cos ϕ)Fz

Fθ = −(sin θ)Fx + (cos θ)Fy

Fϕ = (cos ϕ cos θ)Fx + (cos ϕ sin θ)Fy − (sin ϕ)Fz

We convert the gravitational force GMm F = − e ρ2 ρ 82 Vector Calculus into rectangular coordinates as an example. From the above expression, we have

GMm F = − , F = 0, F = 0. ρ ρ2 θ ϕ

Therefore, GMm GMm x GMm Fx = − (sin ϕ cos θ) = − · = − x ρ2 ρ2 ρ ρ3 GMm GMm y GMm Fy = − (sin ϕ sin θ) = − · = − y ρ2 ρ2 ρ ρ3 GMm GMm z GMm Fz = − (cos ϕ) = − · = − z, ρ2 ρ2 ρ ρ3 and so: GMm xi + yj + zk F = Fxi + Fyj + Fzk = − (xi + yj + zk) = −GMm , ρ3 (x2 + y2 + z2)3/2 which is exactly the same as we stated in the previous subsection. 4.2 Line Integrals of Vector Fields 83 4.2 Line Integrals of Vector Fields 4.2.1 Definition of Line Integrals

Work is defined to be: force × displacement. Precisely, if F is a constant force, and ∆r := r2 − r1 is the displacement vector (which denotes the change of position from r1 to r2, then:

Work = F · ∆r.

However, if the force is not a constant, meaning that either the direction or the magnitude is not uniform, the work done by the force is not as simple as stated above. Likewise, if the path is not a straight-path but a curved one so that ∆r is changing over time, the work done by the force is again a bit more complicated. Line integrals of vector fields are introduced to handle these more complicated scenarios. Definition 4.1 — Line Integrals of Vector Fields. Given a vector field F(x, y, z) and a path C which is parametrized by r(t), a ≤ t ≤ b, the line integral of F over C is defined to be:

b F · r0(t) dt. ˆa

Notation In the spirit of dr = r0(t) dt, we may denote the above line integral in a concise way as: F · dr ˆC where C is the given path.

Let’s first look at some computational examples before we explain the physical and geomet- ric meanings of line integrals.

 Example 4.1 Let F(x, y) = −yi + xj and C be the counter-clockwise path along the circular arc from (1, 0) to (0, 1). Find the line integral C F · dr. ´

 Solution First we parametrize C. A quick sketch of the path should tell us that C can be parametrize by: π r(t) = (cos t)i + (sin t)j, 0 ≤ t ≤ . 2 By the definition of line integrals,

t=π/2 F · dr = F · r0(t) dt ˆC ˆt=0 π/2 = (−yi + xj) · ((− sin t)i + (cos t)j) dt ˆ0 π/2 = (y sin t + x cos t) dt. ˆ0 Along the path C, we have x = cos t and y = sin t according to the parametrization, so we have: π/2   F · dr = sin2 t + cos2 t dt ˆC ˆ0 π/2 π = 1 dt = . ˆ0 2 84 Vector Calculus

A line integral in three dimension can be computed in an exactly the same way as in two dimension. Let’s see one example:

2 2 2  Example 4.2 Let F(x, y, z) = (y − x )i + (z − y )j + (x − z )k and C be a path parametrized 2 3 by: r(t) = ti + t j + t k where 0 ≤ t ≤ 1. Compute the line integral C F · dr. ´

 Solution We are given the parametrization in the problem so we can proceed to the computation of the line integral:

1 D E F · dr = y − x2, z − y2, x − z2 · r0(t) dt ˆC ˆ0 1 D E D E = y − x2, z − y2, x − z2 · 1, 2t, 3t2 dt ˆ0 1   = y − x2 + 2t(z − y2) + 3t2(x − z2) dt. ˆ0

Along the path C, we have x = t, y = t2 and z = t3, so:

1   F · dr = t2 − t2 + 2t(t3 − t4) + 3t2(t − t6) dt ˆC ˆ0 1   29 = 3t3 + 2t4 − 2t5 − 3t8 dt = . ˆ0 60

4.2.2 Physical Meaning of Line Integrals

We introduce line integrals here because C F · dr is exactly equal to the work done by the force F to move a particle from the starting´ point of C to the ending point of C. By the virtue of Riemann sums, the line integral can be thought as:

b F · r0(t) dt ' F · r0(t )∆t ˆ ∑ i i a i Page 4 where we divide the time interval [a, b] into small subdivisions:

a =: t0 < t1 < ... < ti < ... < tn := b.

Assume each subdivision is very small that F and r0 are roughly constants in each subdivi- sion. By definition of derivatives, we have:

0 ∆r(ti) r (ti) ' . ∆ti 4.2 Line Integrals of Vector Fields 85

Therefore, b F · r0(t) dt ' F · ∆r(t ) ˆ ∑ i a i

Since F · ∆r(ti) is the work done by F with displacement ∆r(ti) (as they are roughly constants), summing them up gives the approximated total work done by the force over the whole path C. As n → ∞, the subdivisions become infinitesimal and ∑i F · ∆r(ti) becomes more accurate and approaches to the total work done by the force.

4.2.3 Geometric meaning of line integrals The line integral of a given vector field F over a path C can indicate whether the path C is overall flowing along the vector field. Recall that the line integral is given by:

F · r0(t) dt. ˆC The value of the integral is determined by many factors, including the length of C, the magnitude of F and the velocity of r(t). However, the sign of F · r0(t) is solely determined by the angle θ between F and the tangent vector r0(t) of the curve C, since: 0 0 F · r (t) = |F| r (t) cos θ. It is positive when F and r0(t) make an acute angle, and is negative when they make an obtuse angle. Therefore, the sign of the line integral can roughly reveal whether the path C is along or against the direction of the vector field F. The more positive is the value, the more often the path is traveling along the vector field. This geometric meaning is very vague when compared to the physical meaning, but is a good one for us to develop intuition towards line integrals.

4.2.4 Independence of parametrization Recall that when computing a line integral, we first pick a parametrization of the path. There is one logical gap we need to fill in, namely if we choose two different parametrizations of the same path, will we get the same answer for the line integral? The answer is positive, as we can show it is true using the chain rule: Suppose r(τ), a ≤ τ ≤ b, and r(t), c ≤ t ≤ d, are two parametrizations of a path C. Therefore, their endpoints must match, i.e. τ = a if and only if t = c, and τ = b if and only if t = d. Then, using the τ-parametrization, the line integral is given by:

τ=b F · r0(τ) dτ. ˆτ=a By chain rule, we have: dr dr dt dt r0(τ) = = = r0(t) , dτ dt dτ dτ and so τ=b τ=b dt F · r0(τ) dτ = F · r0(t) dτ. ˆτ=a ˆτ=a dτ dt By , we have dt = dτ dτ and so: τ=b t=d F · r0(τ) dτ = F · r0(t) dt ˆτ=a ˆt=c which is exactly how we defined line integral using t as the parameter. The above shows that no matter how we parametrize the path, as far as the endpoints of the path are kept unchanged, the line integral must be the same. We call this independence of parametrization. Physically speaking, this tells us as far as a particle travels along a fixed path C, the work done by the force does not depend on how fast the particle travels. 86 Vector Calculus

4.2.5 Alternative notations for line integrals In some textbooks, a line integral is denoted using the arc-length parametrization. Recall that a path r(s) is said to be arc-length parametrized if |r0(s)| = 1 for any s, i.e. unit speed. For such a parametrization, it is a convention to use s as a parameter. Since the line integral of a given vector field F over a path C is independent of parametriza- tion, we can also denote the line integral using the s parameter:

F · r0(s) ds. ˆC The value is the same using any other parametrizations. Since r0(s) is a unit vector, conventionally it is often denoted by Tˆ inside a line integral, meaning that it is a unit tangent vector of the path C. Therefore, you may see that occasionally the line integral is denoted by: F · Tˆ ds. ˆC

It simply means C F · dr. Another common´ notation for line integrals is so-called the notation. Recall that r is the position vector r = xi + yj + zk, and so symbolically dr can be regarded as: dr = dxi + dyj + dzk.

Let F = Fxi + Fyj + Fzk, then (again symbolically)

F · dr = Fx dx + Fy dy + Fz dz. Therefore, another common notation for line integrals is:

Fx dx + Fy dy + Fz dz. ˆC

Again, it simply means C F · dr. While the ds-notation´ does not have much practical use, there are some practical advantages for using the differential form notations if the path C is parallel to one of the coordinate axes. Let’s illustrate this through an example:

 Example 4.3 Compute the line integral

−y dx + x dy ˆC where C is the path from (0, 1) down to (0, 0) along the y-axis, then to (1, 0) along the x-axis.

 Solution Note that the path C has two segments. Break C into two segments C1 and C2 where C1 is the path from (0, 1) to (0, 0) along the y-axis, and C2 is the path from (0, 0) to (1, 0) along the x-axis.

(0, 1)

C1

(1, 0) (0, 0) C2

The line integral is broken down into two:

−y dx + x dy = −y dx + x dy + −y dx + x dy. ˆ ˆ ˆ C C1 C2 4.2 Line Integrals of Vector Fields 87

Along C1, we have x = 0, and so dx = 0 too. From this we can immediately tell that

−y dx + x dy = −y d(0) + 0 dy = 0. ˆ ˆ C1 C1

Similarly, along C2, we have y = 0, and so:

−y dx + x dy = −0 dx + x d(0) = 0. ˆ ˆ C2 C2 Combining the two results, we know:

−y dx + x dy = 0. ˆC

To conclude, given a path C parametrized by r(t) and a vector field F = Fxi + Fyj + Fzk, then the following are all equivalent notations for line integrals:

0 F · dr = F · r (t) dt = F · Tˆ ds = Fx dx + Fy dy + Fz dz. ˆC ˆC ˆC ˆC 88 Vector Calculus 4.3 Conservative Vector Fields 4.3.1 Definition and Consequences This section discusses a special type of fields called conservative vector fields. The term conservative is rooted from physics, not politics! We will first study its definition, and then investigate the features that make conservative vector fields distinguished from generic ones. Definition 4.2 — Conservative Vector Field. A vector field F is called a conservative vector field if and only if it is in the form of F = ∇ f where f is a scalar function. The scalar function f is called a potential function of the vector field F.

Recall that ∇ f denotes the gradient vector of f , defined by:

∂ f ∂ f ∂ f ∇ f = i + j + k. ∂x ∂y ∂z

The most preliminary method to determine whether a given vector field is conservative is to solve for the f , as illustrated by the following example:

 Example 4.4 Consider the vector field

F(x, y, z) = (2x + y)i + (x + z3)j + (3yz2 + 1)k.

Determine whether or not F is a conservative vector field. If so, find its potential function f such that F = ∇ f .

 Solution F is conservative if and only if F = ∇ f for some scalar function f , or equivalently, the following equations hold simultaneously:

∂ f 1 = 2x + y ∂x ∂ f 3 2 = x + z ∂y

∂ f 2 3 = 3yz + 1 ∂z

From 1 , one can find f (x, y, z) by integrating 2x + y by x, regarding y to be a constant. In single variable calculus, an integration constant will be added after the integration. However, we are now considering partial derivatives, and not only constants but also y or z will vanish after differentiating by x! In other words, the “integration constant” is no longer a constant but instead a function of y and z. Precisely, we have:

2 4 f (x, y, z) = x + yx + g(y, z)

where g(y, z) is some function of y and z. We will figure out g(y, z) in the remaining steps. By differentiating both sides of 4 with respect to y, we get:

∂ f ∂g = x + . ∂y ∂y

Compare this with 2 , one need to have

∂g = z3. ∂y

An integration by y yields: g(y, z) = yz3 + h(z). 4.3 Conservative Vector Fields 89

Note that from similar principle discussed above for f , the integration “constant” is no longer just a constant but a function not depending on y and hence a function of z only. By differentiating both sides with respect to z, we get:

∂g = 3yz2 + h0(z). ∂z 0 Finally, by comparing this result with 3 , one must have h (z) = 1, and so clearly h(z) = z + C, where C is geniunely a constant this time! Combining all results above, we have

f (x, y, z) = x2 + yx + yz3 + z + C

where C is any real constant. It can be easily checked that F = ∇ f . Therefore, F is a conservative vector field with potential functions given by the above f ’s.

It is important to keep in mind that not all vector fields are conservative! Here is one which such an f does not exist:

 Example 4.5 Let F(x, y) = −yi + xj. Determine whether or not F is a conservative vector field. If so, find its potential function f such that F = ∇ f .

 Solution If f is a scalar function such that F = ∇ f , then we have:

∂ f 1 = −y ∂x ∂ f 2 = x ∂y

Solving 1 for f by integration, we get f (x, y) = −xy + g(y) for some function g(y). However, by differentiating this result with respect to y, we get:

∂ f = −x + g0(y). ∂y

In order to be consistent with 2 , we would require

−x + g0(y) = x, or equivalently, g0(y) = 2x.

However, g0(y) is a function of y, not of x! Since it leads to inconsistency, such an f cannot exist and so F is not conservative.

One important feature of conservative vector fields is the path independence of line integral, meaning that the line integral depends only on the end-points of the path. Precisely, we have:

Theorem 4.1 Given a conservative vector field F = ∇ f , where f is a potential function, then along any path C connecting from point P0(x0, y0, z0) to point P1(x1, y1, z1), then the line integral is given by:

F · dr = f (x1, y1, z1) − f (x0, y0, z0). ˆC

Proof. The proof is a consequence of the multivariable chain rule and the Fundamental Theorem of Calculus. D ∂ f ∂ f ∂ f E Suppose F = ∇ f = ∂x , ∂y , ∂z and the path C is parametrized by:

r(t) = hx(t), y(t), z(t)i , a ≤ t ≤ b. 90 Vector Calculus

Then, dr F · dr = F · dt ˆC ˆC dt dr = ∇ f · dt ˆC dt b  ∂ f ∂ f ∂ f   dx dy dz  = , , · , , dt ˆa ∂x ∂y ∂z dt dt dt b  ∂ f dx ∂ f dy ∂ f dz  = + + dt ˆa ∂x dt ∂y dt ∂z dt b d = f (r(t))dt (chain rule) ˆa dt = f (r(b)) − f (r(a)) (Fundamental Theorem of Calculus)

If (x0, y0, z0) and (x1, y1, z1) are the initial and final positions respectively, then r(a) = hx0, y0, z0i and r(b) = hx1, y1, z1i. Therefore,

F · dr = f (x1, y1, z1) − f (x0, y0, z0). ˆC

 The significance of this theorem is that the RHS depends only on the initial and final points of the curve C, but not the intermediate path. In other words, if C1 and C2 are two paths with the same initial and final points, then we have F · dr = F · dr. Moreover, if C is closed C1 C2 ´ ´ path whose initial and final positions are the same, then C F · dr = 0. ¸ Notation If C is a closed path, meaning that the two endpoints are the same, it is a convention to use denote the line integral as:

F · dr. ˛C

To summarize, we have:

Corollary 4.2 For a conservative vector field F, if C1 and C2 are two paths with the same initial and final positions, then

F · dr = F · dr. ˆ ˆ C1 C2 Moreover, if C is a closed path, then

F · dr = 0. ˛C

Theorem 4.1 can be applied to find the line integral of a conservative vector field if path is too complicated.

3 2  Example 4.6 Consider the vector field F(x, y, z) = (2x + y)i + (x + z )j + (3yz + 1)k which appeared in Example 4.4, and the path C given by the parametric equation

t r(t) = (et sin2012 t)i + (cos2013 t)j + k, 0 ≤ t ≤ 2π. π

Find the line integral C F · dr. ´ 4.3 Conservative Vector Fields 91

 Solution It is clear that direct computation of this line integral is extremely laborious (and may be impossible). Fortunately, it was shown in Example 4.4 that F is a conservative vector field with potential function

f (x, y, z) = x2 + yx + yz3 + z + C.

The initial and final positions of the given path are respectively:

r(0) = h0, 1, 0i r(2π) = h0, 1, 2i.

By Theorem 4.1, the line integral in question is simply given by:

F · dr = f (0, 1, 2) − f (0, 1, 0) = (10 + C) − C = 10. ˆC Alternatively, one can also find this line integral using the path independence property of conservative vector fields. Let L be the straight path connecting from (0, 1, 0) to (0, 1, 2) which are the initial and final positions respectively. Since F is conservative, we must have:

F · dr = F · dr. ˆC ˆL The latter is much easier to figure out: along the line L, we have

x ≡ 0, y ≡ 1, 0 ≤ z ≤ 2.

Therefore

F · dr = (2x + y)dx + (x + z3)dy + (3yz2 + 1)dz ˆL ˆL z=2 = (0 + 1)d(0) + (0 + 23)d(1) + (3 · 1 · z2 + 1)dz ˆz=0 2 = (3z2 + 1)dz ˆ0 2 = z3 + z = 10. 0

By path independence, C F · dr = L F · dr = 10. ´ ´ 4.3.2 Conservation of Total Energy In physics, it is a convention to define the potential function V of a conservative vector fields F(x, y, z) by: F(x, y, z) = −∇V(x, y, z). The scalar potential function V is often called by physicists the potential energy. Let r(t) represent the path of a particle with mass m. Its kinetic energy is defined as:

1 KE = m|r0(t)|2. 2 It is a nice exercise to show that the total energy is conserved: 92 Vector Calculus

 Exercise 4.1 Let F = −∇V and r(t) = hx(t), y(t), z(t)i, show that the total energy:

1 E(t) := m|r0(t)|2 + V(r(t)) 2 is a constant. [Hint: multivariable chain rule and Newton’s Second Law F = mr00(t).]

4.3.3 Gradient Vector in Various Coordinate Systems The gravitational force field between the Earth (with mass M) and a point particle (with mass m) is given by: xi + yj + zk F(x, y, z) = −GmM (x2 + y2 + z2)3/2 where G is the gravitational constant, and the (x, y, z) coordinates are chosen so that (0, 0, 0) is the center of the Earth. It is a very clumsy expression, but we have seen in the previous section, it becomes more succinct if one converts it into spherical coordinates:

GMm F = − e . ρ2 ρ

To determine whether this kind of radially symmetric vector field is conservative and find its potential function, it is more efficient to use the gradient ∇ f in spherical coordinates form: ∂ f 1 ∂ f 1 ∂ f ∇ f = e + e + e . ∂ρ ρ ρ ∂ϕ ϕ ρ sin ϕ ∂θ θ

The derivation of the above conversion is straight-forward but quite tedious. One can first write ∂ f ∂ f ∂ f ∇ f = i + j + k, ∂x ∂y ∂z

then convert each of i, j and k into eρ, eθ and eϕ. Secondly, use the chain rule to express each ∂ f ∂ f ∂ f ∂ f ∂ f ∂ f of ∂x , ∂y and ∂z in terms of ∂ρ , ∂θ and ∂ϕ . We omit this tedious derivation here (readers may verify this as an exercise if you wish). Here we are going to use the spherical form of ∇ to find a potential function for F. Firstly, set F = ∇ f and try to solve for f :

GMm ∂ f 1 ∂ f 1 ∂ f − e = ∇ f = e + e + e . ρ2 ρ ∂ρ ρ ρ ∂ϕ ϕ ρ sin ϕ ∂θ θ

Equating the components, we get a system of equations:

GMm ∂ f − = ρ2 ∂ρ ∂ f 0 = ∂ϕ ∂ f 0 = ∂θ The first equation gives GMm f (ρ, θ, ϕ) = + g(θ, ϕ) ρ where g(θ, ϕ) is an arbitrary function of θ and ϕ. However, the second and the third equations demand that ∂g ∂g = = 0. ∂θ ∂ϕ 4.3 Conservative Vector Fields 93

Therefore, g must be a constant. GMm To conclude, the gravitational force field F is conservative with potential function f = ρ . Many physicists prefer using the convention F = −∇V when defining the potential energy V. Hence, the gravitational potential energy is given by:

GMm V = − . ρ

Likewise, there is an analogous expression for ∇ f in cylindrical coordinates: ∂ f 1 ∂ f ∂ f ∇ f = e + e + k ∂r r r ∂θ θ ∂z

Using this expression, the vector field F = reθ (which is equal to −yi + xj in rectangular coordinates) can be seen to be not conservative. Set F = ∇ f , then:

∂ f 1 ∂ f ∂ f re = e + e + k, θ ∂r r r ∂θ θ ∂z and so: ∂ f 0 = ∂r 1 ∂ f r = r ∂θ ∂ f 0 = ∂z

The second equation shows f (r, θ, z) = r2θ + g(r, z) for some arbitrary function g. Then, we ∂ f ∂g ∂g will have ∂z = ∂z , and so the third equation demands that ∂z = 0. Therefore, g depends only on r, and we can simply write g = g(r). Finally, we have f (r, θ, z) = r2θ + g(r). Differentiation ∂ f 0 with respect to r gives ∂r = 2rθ + g (r). Then the first equation demands that: g0(r) 0 = 2rθ + g0(r), or equivalently: = −θ. 2r However, the LHS would be a function of r but the RHS is a function of θ. This cannot be true. This shows the vector field F is not conservative.

4.3.4 Test It is possible to test whether a given vector field is conservative or not without attempting to find its potential function. This test is commonly called the curl test. The use of term curl comes from the fact that it involves calculations of the curl of a vector field. Definition 4.3 — Curl of a Vector Field. Given a vector field

F = Fxi + Fyj + Fzk,

the curl of the vector field F, denoted either by curl(F) or ∇ × F, is defined as:

 ∂ ∂ ∂  ∇ × F = i + j + k × F i + F j + F k ∂x ∂y ∂z x y z

i j k

= ∂ ∂ ∂ ∂x ∂y ∂z Fx Fy Fz       ∂F ∂Fy ∂F ∂F ∂Fy ∂F = z − i + x − z j + − x k. ∂y ∂z ∂z ∂x ∂x ∂y 94 Vector Calculus

∂ ∂ ∂ i The symbol ∇ := ∂x i + ∂y j + ∂z k should be considered as an rather than a vector. ∂ It carries no physical or geometric meaning. “Multiplying” ∂x by a function P gives the ∂P partial derivative ∂x as the product.

i If F = Fxi + Fyj is a two dimensional vector field, the curl ∇ × F can also be defined by regarding the k-component to be zero, i.e.

F = Fxi + Fyj + 0k.

It can be easily verified that   ∂Fy ∂F ∇ × F = − x k. ∂x ∂y

i ∇ × F is called the curl of F because it measures how circular the vector field F is, as a consequence of the Green’s and Stokes’ Theorems which we will learn very soon.

 Example 4.7 Compute the curl of the vector field:

F = (2x + y)i + (x + z3)j + (3yz2 + 1)k.

 Solution

i j k ∂ ∂ ∂ ∇ × F = ∂x ∂y ∂z 2x + y x + z3 3yz2 + 1  ∂ ∂   ∂ ∂  = (3yz2 + 1) − (x + z3) i + (2x + y) − (3yz2 + 1) j ∂y ∂z ∂z ∂x  ∂ ∂  + (x + z3) − (2x + y) k ∂x ∂y = (3z2 − 3z2)i + (0 − 0)j + (1 − 1)k = 0.

The curl test is a very straight-forward test to check whether a given vector field F is conserva- tive, without solving for the potential function f . Theorem 4.3 — Curl Test. Given a vector field F is defined and continuously differentiable everywhere in R3 (or everywhere in R2 for vector fields in R2), then F is conservative if and only if ∇ × F = 0.

Proof. The theorem has two directions (if and only if). One direction is easy, while the other direction is more technical. We only present the (⇒)-part here The (⇒)-part of the proof is a consequence of the Mixed Partial Theorem. Suppose F = ∇ f , ∂ f ∂ f ∂ f then F = ∂x i + ∂y j + ∂z k, and so

i j k

∂ ∂ ∂ ∇ × F = ∂x ∂y ∂z ∂ f ∂ f ∂ f ∂x ∂y ∂z  ∂2 f ∂2 f   ∂2 f ∂2 f   ∂2 f ∂2 f  = − i + − j + − k ∂y∂z ∂z∂y ∂z∂x ∂x∂z ∂x∂y ∂y∂x = 0i + 0j + 0k = 0.

 4.3 Conservative Vector Fields 95

Therefore, to check whether or not F is conservative assuming it is defined and smooth everywhere, it is not necessary to solve for the potential function f . All is needed is to find ∇ × F which only involves differentiation but not integration. However, the curl test only tells you whether or not the vector field is conservative. It fails to tell you what the potential function is. In Example 4.7, the vector field F is defined (and is smooth) everywhere in R3. Therefore, the curl test can be applied on this F. Since ∇ × F = 0 as shown in the example, the curl test asserts that F is conservative. Note that the curl test only asserts whether or not F is conservative. It provides no information about the potential function f . However, knowing that F is conservative without knowing the potential f is still enlightening as demonstrated in the alternative solution of Problem 4.6: one can calculate the line integral by choosing an easier path. On the other hand, if G = −yi + xj which is defined and is smooth everywhere in R2, the curl test can be applied to G. One can verify that ∇ × G = 2k 6= 0, so the curl test concludes that G is not conservative. There is no need to attempt solving the potential function, as it cannot never exist. It is worthwhile to note that the condition F is defined and continuously differentiable everywhere is crucial when applying the curl test. The following example tells us why: Let

y x F(x, y) = − i + j. x2 + y2 x2 + y2

This vector field is not defined when (x, y) 6= (0, 0). It can be easily verified (left as an exercise) that ∇ × F = 0 for any (x, y) 6= (0, 0). However, when integrating along the unit circle C : r(t) = (cos t)i + (sin t)j where 0 ≤ t ≤ 2π:

−ydx + xdy · = F dr 2 2 ˛C ˛C x + y 2π − sin td(cos t) + cos td(sin t) = 2 2 ˆ0 sin t + cos t 2π (sin2 t + cos2 t)dt = 2 2 ˆ0 sin + cos t 2π = 1dt = 2π 6= 0. ˆ0

C is a closed path, so F cannot be conservative even thought the curl of F is zero! The curl test fails here because the vector field is not defined everywhere.

4.3.5 Simply-Connected Regions

You have seen that if a two-dimensional vector field F is not defined at (0, 0), one cannot always use the curl test to conclude whether it is conservative. However, one can relax the everywhere-defined condition a little bit to allow us to use the curl test on vector fields which can be undefined at some points. First we introduce: Definition 4.4 — Simply Connected Regions. A region Ω is simply connected if every closed loop in Ω can be contracted to a point continuously without leaving the region Ω. 1098 Chapter 15 • Vector Calculus

C DEFINITION Simple and Closed Curves C Suppose a curve C 1in ޒ2 or ޒ32 is described parametrically by r1t2, where … … 1 2 ϶ 1 2 a t b. Then C is a simple curve if r t1 r t2 for all t1 and t2, with 6 6 6 a t1 t2 b; that is, C never intersects itself between its endpoints. The Closed, simple Not closed, simple curve C is closed if r1a2 = r1b2; that is, the initial and terminal points of C are the same (Figure 15.27). C C

Closed, not simple Not closed, not simple In all that follows, we generally assume that R in ޒ2 (or D in ޒ3) is an open region. FIGURE 15.27 Open regions are further classified according to whether they are connected and whether they are simply connected. ➤ Recall that all points of an are interior points. An open set does not contain its boundary points. DEFINITION Connected and Simply Connected Regions An open region R in ޒ2 (or D in ޒ3) is connected if it is possible to connect any two ➤ Roughly speaking, connected means points of R by a continuous curve lying in R. An open region R is simply connected that R is all in one piece and simply if every closed simple curve in R can be deformed and contracted to a point in R connected in ޒ2 means that R has ( Figure 15.28). no holes. ޒ2 and ޒ3 are themselves 96 connected and simply connected. Vector Calculus

QUICK CHECK 1 Is a figure-8 curve simple? Closed? Is a connected? Simply This curve cannot be contracted connected? ➤ to a point and remain in R.

R Test for Conservative Vector Fields R We begin with the central definition of this section.

Connected, Connected, DEFINITION Conservative Vector Field simply connected not simply connected A vector field F is said to be conservative on a region (in ޒ2 or ޒ3) if there exists a .scalar function w such that F = ٌw on that region R R

Suppose that the components of F = 8 f, g, h9 have continuous first partial deriva- Not connected, Not connected, tives on a region D in ޒ3. Also assume that F is conservative, which means by defini- simply connected not simply connected tion that there is a potential function w such that F = ٌw. Matching the components of FIGURE 15.28 F and ٌw, we see that f = w , g = w , and h = w . Recall from Theorem 13.4 that if R2 x y z The set with the origin removed, is not simply-connected, asa thefunction loops has that continuous go around second partial derivatives, the order of differentiation in the R3 the origin cannot be contracted➤ toThe a term point conservative without refers “touching” to the origin.second However, partial derivatives the set does not matter. Under these conditions on w, we conclude with the origin removed is simply-connectedconservation of energy. – draw See aExercise picture 52 to convincethe following: yourself on that! Simply connected regions arefor related an example to of the conservation curl test ofbecause energy one can also apply the curl test even though F is not definedin everywhere a conservative force but field. is defined on a simply• wxy = connectedwyx, which region. implies that fy = gx, For instance, the gravitational vector field • wxz = wzx, which implies that fz = hx, and ➤ Depending on the contextx iand+ theyj + zk • w = w , which implies that g = h . F(x, y, z) = −GMm yz zy z y interpretation of the vector(x2 + field,y2 +thez 2)3/2 potential may be defined such that These observations comprise half of the proof of the following theorem. The remainder of .is not defined on (0, 0, 0) but is definedF = - ٌw on (with everywhere a negative sign). else in R3. Sincethe proof the set is givenR3 with in Section origin 15.4 removed is simply connected, the curl test applies to this vector field! With some straight-forward (but lengthy) computations, one can verify that ∇ × F = 0 for any (x, y, z) 6= (0, 0, 0). Since the curl test applies here, it concludes that F is conservative.

4.3.6 Curl Operator in Various Coordinate Systems Note the definition of ∇ × F, i.e.

i j k       ∂Fz ∂Fy ∂Fx ∂Fz ∂Fy ∂Fx ∇ × F = ∂ ∂ ∂ = − i + − j + − k ∂x ∂y ∂z ∂y ∂z ∂z ∂x ∂x ∂y Fx Fy Fz applies to rectangular coordinates only! For spherical coordinates (and analogously for cylindrical coordinates), ∇ × F is not defined as:

eρ eθ eϕ

∇ × F = ∂ ∂ ∂ WRONG! ∂ρ ∂θ ∂ϕ Fρ Fθ Fϕ Instead, ∇ × F in spherical coordinates is simply converting the curl of F in rectangular coordinates into spherical coordinates using the conversion rules established in the previous section. Namely, we need to: 1. convert all i, j and k into eρ, eθ and eϕ; 2. convert all Fx, Fy and Fz into Fρ, Fθ and Fϕ; 3. use the chain rule (repeatedly) to convert all partial derivatives with respect to x, y and z to partial derivatives with respect to ρ, θ and ϕ. For instance,

∂Fρ ∂Fρ ∂ρ ∂Fρ ∂θ ∂Fρ ∂ϕ = + + . ∂y ∂ρ ∂y ∂θ ∂y ∂ϕ ∂y It is extremely tedious to derive such a conversion rule! We omit the detail of the deriva- tion here. In spherical coordinates, the curl of F can be shown to have a nice determinant formula: 4.3 Conservative Vector Fields 97

eρ ρeϕ (ρ sin ϕ)eθ 1 ∇ × F = ∂ ∂ ∂ . ρ2 sin ϕ ∂ρ ∂ϕ ∂θ Fρ ρFϕ (ρ sin ϕ)Fθ

Although the derivation of the above formula is extremely tedious, one of its neat con- is that if F is radially symmetric, i.e. Fϕ = Fθ = 0 and Fρ depends only on ρ, then:

eρ ρeϕ (ρ sin ϕ)eθ 1 ∇ × F = ∂ ∂ ∂ = 0. ρ2 sin ϕ ∂ρ ∂ϕ ∂θ Fρ 0 0 For instance, the gravitational force field is radially symmetric, so its curl is a zero vector. Although the spherical coordinates are undefined at the origin, it is not an issue when applying the curl test using spherical coordinates since R3 with the origin removed is a simply connected region. Therefore, any radially symmetric vector field R3 is conservative. There is an analogous formula for cylindrical coordinates which is given by the following. Again, we omit its derivation since it is also tedious to do so.

er reθ k 1 ∇ × F = ∂ ∂ ∂ . r ∂r ∂θ ∂z Fr rFθ Fz

Similarly, it can also be checked that for vector fields with Fθ = Fz = 0 and Fr depends only on r, then ∇ × F = 0. However, cylindrical coordinates cannot be defined on the z-axis. Since R3 with the z-axis removed is not simply connected, one needs to be very careful when applying the curl test under cylindrical coordinates! 98 Vector Calculus 4.4 Green’s Theorem 4.4.1 The Theorem and its Uses We will exclusively deal with two-dimensional vector fields in this section. In the previous section, we see that if F is a two-dimensional vector field which is defined on a simply- connected region Ω in R2 (such as the entire R2 plane), then the curl test says ∇ × F = 0 if and only if F is conservative, and so for such a vector field, we have

F · dr = 0 ˛C

for any closed curve C in R2. It is natural to ask if there is any hidden relation between ∇ × F and C F · dr, given that the former being a zero vector implies the latter is zero. The Green’s Theorem¸ gives the relationship between them. Definition 4.5 — Simple Closed Curves. A curve C is called a simple closed curve if the two endpoints coincide and it does not intersect itself at any point (other than the endpoints). 1098 Chapter 15 • Vector Calculus

C DEFINITION Simple and Closed Curves C Suppose a curve C 1in ޒ2 or ޒ32 is described parametrically by r1t2, where … … 1 2 ϶ 1 2 a t b. Then C is a simple curve if r t1 r t2 for all t1 and t2, with 6 6 6 a t1 t2 b; that is, C never intersects itself between its endpoints. The Closed, simple Not closed, simple curve C is closed if r1a2 = r1b2; that is, the initial and terminal points of C are the same (Figure 15.27). C C

Closed, not simple Not closed, not simple In all that follows, we generally assume that R in ޒ2 (or D in ޒ3) is an open region. FIGURE 15.27 Open regions are further classified according to whether they are connected and whether they are simply connected. Theorem 4.4 — Green’s Theorem. Let C be a simple closed curve in R2 which is counter- clockwise oriented. Suppose➤ the Recall curve that allC pointsencloses of an open region set are R. Let F(x, y) be a vector field which is defined and is continuouslyinterior points. differentiable An open set does at every not point in R, then: contain its boundary points. DEFINITION Connected and Simply Connected Regions An open region R in ޒ2 (or D in ޒ3) is connected if it is possible to connect any two · d = (∇ × ) · dA ➤ F r F k points of R by a continuous curve lying in R. An open region R is simply connected ˛RoughlyC speaking,¨R connected means |that{z R is }all in one| piece and{z simply } if every closed simple curve in R can be deformed and contracted to a point in R line integral double integral connected in ޒ2 means that R has ( Figure 15.28). no holes. ޒ2 and ޒ3 are themselves connected and simply connected. i In terms of components of F in rectangular coordinates, i.e. F = Fxi + Fyj, one can easily  ∂F 

verify that ∇ × F = y − ∂Fx k, so the above Green’s TheoremQUICK can be CHECK stated 1 as: Is a figure-8 curve simple? Closed? Is a torus connected? Simply ∂x ∂y This curve cannot be contracted connected? ➤ to a point and remain in R.   ∂Fy ∂F + = − x Fxdx Fydy R dA. Test for Conservative Vector Fields ˛C ¨R ∂x ∂y R We begin with the central definition of this section.

2 i In particular, if ∇ × F = 0 and F is defined on all of R , then the Green’s Theorem tells us that Connected, Connected, DEFINITION Conservative Vector Field simply connected not simply connected F · dr = 0 · k dA = 0 A vector field F is said to be conservative on a region (in ޒ2 or ޒ3) if there exists a ˛ ¨ .C R scalar function w such that F = ٌw on that region for any simple closed curve C. ThisR is exactly what theR curl test tells us!

The proof of the Green’s Theorem is quite technical if the curve C is complicated. The proof of the theorem in some special cases can be found in the textbook. Let’sSuppose first that look the at components some of F = 8 f, g, h9 have continuous first partial deriva- Not connected, Not connected, tives on a region D in ޒ3. Also assume that F is conservative, which means by defini- examples. simply connected not simply connected tion that there is a potential function w such that F = ٌw. Matching the components of FIGURE 15.28 F and ٌw, we see that f = wx, g = wy, and h = wz. Recall from Theorem 13.4 that if a function has continuous second partial derivatives, the order of differentiation in the ➤ The term conservative refers to second partial derivatives does not matter. Under these conditions on w, we conclude conservation of energy. See Exercise 52 the following: for an example of conservation of energy in a conservative force field. • wxy = wyx, which implies that fy = gx,

• wxz = wzx, which implies that fz = hx, and ➤ Depending on the context and the • wyz = wzy, which implies that gz = hy. interpretation of the vector field, the potential may be defined such that These observations comprise half of the proof of the following theorem. The remainder of .F = - ٌw (with a negative sign). the proof is given in Section 15.4 4.4 Green’s Theorem 99

 Example 4.8 Use the Green’s Theorem to evaluate the line integral:

−ydx + xdy ˛C where C is a rectangular loop (0, 0) → (1, 0) → (1, 1) → (0, 1) → (0, 0).

 Solution The vector field corresponding to this line integral is F = −yi + xj, which is defined and is smooth everywhere on R2. The given path C encloses the rectangle R with vertices (0, 0), (1, 0), (1, 1) and (0, 1).

(0, 1) (1, 1)

(0, 0) (1, 0)

By straight-forward calculation, the curl of F is given by:

∇ × F = 2k.

By the Green’s Theorem, we have:

−ydx + xdy = F · dr ˛C ˛C = (∇ × F) · k dA ¨R = 2k · k dA ¨R = 2 dA ¨R = 2 × area of R = 2.

2 2  Example 4.9 Let F = −y i + xyj and C be a simple closed curve in R enclosing a region R which is symmetric across the x-axis. Show that:

F · dr = 0. ˛C

 Solution Since F is defined and is continuously differentiable everywhere, we can apply the Green’s Theorem without any issues. First we compute its curl:

∇ × F = 3yk. 100 Vector Calculus

By the Green’s Theorem

F · dr = (∇ × F) · k dA ˛C ¨R = 3yk · k dA ¨R = 3y dA. ¨R Since the region R is symmetric across the x-axis and 3y is an odd function. If we split R into two: R+ (the part above the x-axis) and R− (the part below the x-axis), then

3y dA and 3y dA ¨R+ ¨R−

are numerically the same except that one of the negative of the other. Therefore, by

3y dA = 3y dA + 3y dA, ¨R ¨R+ ¨R− we get: 3y dA = 0 ¨R since the R+-integral gets canceled out by the R−-integral. Therefore,

F · dr = 0. ˛C

To summarize, applying Green’s Theorem is very strict forward. It often simplifies the computation of a line integral over a closed loop since there is no need to parametrize the curve. If the curve encloses a rectangular region, the Green’s Theorem is an effective tool for evaluating the line integral because a double integral over a rectangle is easy to set up.

4.4.2 Significance of the Green’s Theorem There are far-reaching significance of the Green’s Theorem apart from make computations of line integrals easier. One important geometric or physical significance is that it gives an interpretation of the curl of a two-dimensional vector field. Consider a vector field F in R2 defined and being continuously differentiable everywhere. Suppose C is a very tiny simple closed curve enclosing a tiny region R. Then, the quantity (∇ × F) · k can be regarded as a constant inside the region R. By the Green’s Theorem, we have:

F · dr = (∇ × F) · k dA ' (∇ × F) · k (area of R). ˛C ¨R

In other words, 1 (∇ × F) · k ' F · dr. area of R ˛C

Since C F · dr indicates how circular the vector field F is around the curve C, the above result tells¸ us that the quantity (∇ × F) · k at a point measures the density of circulation around that point. That’s why we call ∇ × F to be curl of F because it is an indicator of curliness of a vector field! As an example, you can verify that the curls of the vector fields F = −yi + xj and G = xi + yj are respectively given by: 4.4 Green’s Theorem 101

(∇ × F) · k = 2 (∇ × G) · k = 0

It suggests that F is more circular than G. By plotting them in Mathematica, we can see that vectors in F are circling around the origin, while vectors in G are diverging from the origin.

4.4.3 Limitations of the Green’s Theorem We have stated in the Green’s Theorem that the vector field F needs to be defined at every point in the region R enclosed by the simple closed curve C. It is a crucial condition! Let’s consider the vector field y x F = − i + j. x2 + y2 x2 + y2

It is not defined at the origin (0, 0). If we let C be the (counter-clockwise) unit circle centered at the origin, then C can be parametrized by:

r(t) = (cos t)i + (sin t)j, 0 ≤ t ≤ 2π.

By computing the line integral over C directly, one should get:

F · dr = F · r0(t) dt = h− sin t, cos ti · h− sin t, cos ti dt = 1 dt = 2π. ˛C ˛C ˛C ˛C However, one can check from direct computations that ∇ × F = 0 at every point except the origin, and is undefined at the origin. Therefore, the double integral

(∇ × F) · k dA ¨R | {z } undefined at (0, 0)

is an ! To summarize, one cannot directly apply the Green’s Theorem if the given curve encloses a point at which the vector field is not defined. In such cases, we should either compute the line integral directly by parametrization, or use some other tools to compute it. We will compute this integral by so-called the hole-drilling technique to be presented in the next subsection. If the curve does not enclose the origin (at which F is undefined), then Green’s Theorem applies to F and such a curve without any issues. For instance, if C is a circle centered at (3, 0) with radius 2, then it does not enclose the origin which is the only point at which F is undefined. The Green’s Theorem does show that

F · dr = (∇ × F) · k dA = 0 dA = 0. ˛C ¨R | {z } ¨R =0 at every point in R

4.4.4 of a Closed Curve The following vector field (discussed in the previous part) y x F = − i + j x2 + y2 x2 + y2

is a famous one which concerns about the winding number of a closed curve. As discussed before, it is not defined at the origin, and ∇ × F = 0 at every point except the origin. If C is a closed curve (not necessarily simple closed) in R2 not passing through the origin, there is a celebrated result in topology that says:

F · dr = 2π × number of times C travels around the origin counter-clockwisely. ˛C 102 Vector Calculus

1 −y + x Consequently, the quantity 2π C r2 dx r2 dy is often called the winding number of the curve C. This number is not only important¸ in pure mathematics but also in physics. Furthermore, the computation of surface flux of a vector field satisfying the inverse square law, as we will see later, are also in the same spirit as the computation of the winding number. Our goal here is to use the Green’s Theorem to explain why this line integral gives the winding number for some not-too-complicated curves.

Circles Centered at Origin We start with the simplest case where the curve is a circle with radius ε centered at the origin, which from now on denoted by Γε. The path is parametrized by:

Γε : r(t) = (ε cos t)i + (ε sin t)j, 0 ≤ t ≤ 2π. Then, by straight-forward calculations, we get:

y x t=2π ε sin t ε cos t − dx + dy = − d(ε cos t) + d(ε sin t) ˛ 2 2 2 2 ˆ 2 2 Γε x + y x + y t=0 ε ε t=2π   = sin2 t + cos2 t dt ˆt=0 t=2π = 1 dt = 2π. ˆt=0 Simple Closed Curves Enclosing the Origin Next we see how to apply the Green’s Theorem on a more arbitrary curve, say the curve C in Figure 4.6a. Page 1

Figure 4.6: a curve circling around the origin once

We are going to show that the line integral C F · dr, where F is defined as before, is equal to 2π. First note that F is not defined at the origin¸ which is enclosed by the curve C, so we can’t directly apply the Green’s Theorem on this curve C. To handle this issue, we drill a hole near the origin, i.e. we remove a tiny ball with radius ε centered at the origin from the region (see Figure 4.6b), and further split the punctured region into two parts R1 and R2 by cutting it through line segments L1 and L2 (see Figure 4.6cd). We label each segment of the boundaries by C1, C2, L1, L2, Γ1 and Γ2 with directions indicated in the diagram. Then, according to the directions of C1, C2 and C, the line integral in question can be expressed as:

F · dr = F · dr = F · dr + F · dr. ˛ ˛ ˛ ˛ C C1+C2 C1 C2 4.4 Green’s Theorem 103

Likewise, according to the directions of Γ1, Γ2 and Γε (recall that Γε is the counter-clockwise circle with radius ε centered at the origin), we have:

F · dr = F · dr = F · dr + F · dr. ˛ ˛ ˛ ˛ Γε Γ1+Γ2 Γ1 Γ2 Now our goal is to show that F · dr = F · dr. ˛ ˛ C Γε Note that we have already computed the RHS, which is 2π. The above result will show that the line integral over C is also 2π. In order to show this, we consider each of the regions R1 and R2 shown in Figure 4.6cd. Since R1 does not enclose the origin and F is defined everywhere in R1, the Green’s Theorem can be applied to the region R1. According to the indicated directions of the boundary curves, we see that: boundary of R1 = C1 + L2 − Γ1 + L1. Therefore, by the Green’s Theorem, we have:

F · dr + F · dr − F · dr + F · dr = (∇ × F) ·k dA = 0. (4.1) ˆ ˆ ˆ ˆ ¨ C1 L2 Γ1 L1 R1 | {z } | {z } =0 F·dr boundary of R1 ¸ Similarly, R2 does not enclose the origin so the Green’s Theorem can be applied to R2:

F · dr − F · dr − F · dr − F · dr = (∇ × F) ·k dA = 0. (4.2) ˆ ˆ ˆ ˆ ¨ C2 L1 Γ2 L2 R2 | {z } | {z } =0 F·dr boundary of R2 ¸ By summing up (4.1) and (4.2) and canceling out the line integrals over L1 and L2, we get:

F · dr + F · dr − F · dr − F · dr = 0. ˆ ˆ ˆ ˆ C1 C2 Γ1 Γ2

Since C = C1 + C2 and Γε = Γ1 + Γ2 according to their directions in the diagram, we have:

F · dr = F · dr = 2π. ˛ ˛ C Γε Self-Intersecting Curves Next, let’s use the above result to deal with some intersecting curves. Consider the curve C in Figure 4.7. Page 2

Figure 4.7: a self-intersecting curve circling around the origin twice

Since we have already know how to deal with any simple closed curve enclosing the origin, we are going to split the curve C into simple closed curves C1, C2 and C3 according to Figure 4.7. 104 Vector Calculus

As both C1 and C2 are simple closed curves enclosing the origin, by our previous discussion we know: F · dr = F · dr = 2π. ˛ ˛ C1 C2

For C3, it is a simple closed curve not enclosing the origin. Therefore, the Green’s Theorem can be applied on C3 without any issues. Hence, we have

F · dr = (∇ × F) · k dA = 0 ˛ ¨ −C3 R where R is the region enclosed by C3. Here we have again used the fact that ∇ × F = 0 inside the region R. According the directions indicated on the diagram, we have:

F · dr = F · dr + F · dr + F · dr = 2π + 2π + 0 = 4π. ˛ ˛ ˛ ˛ C C1 C2 C3 Therefore, the winding number of this curve C is equal to 2. Similar technique can be applied to more complicated curves which go around the origin a lot of times. 4.5 Parametric Surfaces 105 4.5 Parametric Surfaces The goal of this and the next sections is to generalize the Green’s Theorem for vector fields on the xy-plane to the Stokes’ Theorem for vector fields on the xyz-space. While the Green’s Theorem relates the line integral of a closed curve to the double integral over the enclosed region, the Stokes’ Theorem relates the line integral to the an integral over a surface whose 15.6 Surface Integrals 1131 boundary is the a closed curve. For that we need to define and make sense of surface integrals. and divide by the surface area of the sphere. Because the temperature varies continuously 4.5.1 Surfaceover Parametrizations the sphere, adding up means integrating. How do you integrate a function over a sur- To beginface? with, This we question need to leads know tohow surface to integrals describe. a surface in the xyz-space. Recall that a curve It helps to keep curves, arc length, and( )line = integrals( ) + in( mind) + as( we) discuss surfaces, Parallel Concepts in space is represented in parametric form r t x t i y t j z t k, and is thought as the path of asurface particle. area, The and values surface of integrals.x(t), y( tWhat) and wez( discovert) represent about the surfaces coordinates parallels of what the we particle at Curves Surfaces time t. already know about curves—all “lifted” up one dimension. Arc length Surface area To present a surface, we need two , say u and v. The general form of a parametric Parameterized Surfaces Line integrals Surface integralsequation of a surface is: A curve in ޒ2 is defined parametrically by r1t2 = 8x1t2, y1t29, for a … t … b; it requires One-parameter Two-parameter one parameter and twor (dependentu, v) = x (variables.u, v)i + yStepping(u, v)j + upz (oneu, v dimension,)k. to define a sur- description description face in ޒ3 we need two parameters and three dependent variables. Letting u and v be pa- Instead oframeters, regarding the generalu and vparametricas time variables, description we of a regard surface them has the as form the coordinates on a uv-plane, and the vector r(u, v) is a function that associates each point (u, v) on the uv-plane to a point r1u, v2 = 8x1u, v2, y1u, v2, z1u, v29. (x(u, v), y(u, v), z(u, v)) in the xyz-space. Since there are two parameters, the image of the functionWe is a make surface the in assumption the xyz-space. that the In otherparameters words, vary the over function a rectangler(u, v) Rcan= be51u thought, v2: as a transformationa … u … thatb, c “wraps”… v … d the6 (Figureuv-paper 15.42 onto). As the the parameters surface. 1u, v2 vary over R, the vector r1u, v2 = 8x1u, v2, y1u, v2, z1u, v29 sweeps out a surface S in ޒ3.

r(u, v) ϭ ͗x(u, v), y(u, v), z(u, v)͘ v z

d S

R

a b u

y c x

uv xyz FIGURE 15.42 A rectangle in the -plane is mapped to a surface in -space. The function r(u, v) is called a parametrization of the surface, and by what we mean “parametrizing a surface” is to give a parametrization of the surface. As we will see later, We work extensively with three surfaces that are easily described in parametric form. parametrizingAs with a parameterized surface is often curves, the a firstparametric step of description computing of a asurface surface is not integral. unique. i Although we use (u, v) to denote the parameters, you can use whatever pair of variables forCylinders the parameters, In Cartesian provided coordinates, that there the is set no confusion. 51x, y, z2: x = a cos u, y = a sin u, 0 … u … 2p, 0 … z … h6 Parametrization via Coordinate Systems is a cylindrical surface of radius a and height h with its axis along the z-axis. Using the Let’s lookparameters at several u elementary= u and v = examplesz, a parametric such description as cylinders, of the spheres cylinder and is cones. 1 2 = 8 1 2 1 2 1 29 = 8 9  Example 4.10 Findr au, parametrization v x u, v , y u for, v the, z u cylinder, v witha cos radiusu, a sin ru0, vand, with z-axis as the central axis.where 0 … u … 2p and 0 … v … h (Figure 15.43).

r(u, v) ϭ ͗a cos u, a sin u, v͘ z  Solution If, under a certain coordinate system, the surface has one of the coordinates being constant, thenv giving a parametrization to that surface is fairly easy: simply take the other two coordinates as parameters, and define r according to the conversiona rules between this coordinate systemh and the rectangular coordinate system. The cylinder described in the problem can be presented by equationS r = r0 under cylindrical coordinatesR (r, θ, z). The conversion rule between cylindricalh and rectangular

QUICK CHECK 1 Describe the surface 02␲ u

r1u, v2 = 82 cos u, 2 sin u, v9, for y 0 … u … p and 0 … v … 1. ➤ FIGURE 15.43 x 106 Vector Calculus

coordinates is given by:

x = r cos θ y = r sin θ z = z

Fix r = r0, then x, y and z are functions of (θ, z). Simply set:

x(θ, z) = r0 cos θ, y(θ, z) = r0 sin θ, z(θ, z) = z,

then the parametrization is given by:

r(θ, z) = (r0 cos θ)i + (r0 sin θ)j + zk, 0 ≤ θ ≤ 2π, −∞ < z < ∞.

Figure 4.8: parametric plot of a cylinder

One can also specify the range of z so that the parametrization gives a finite cylinder. For instance, r(θ, z) = (r0 cos θ)i + (r0 sin θ)j + zk, 0 ≤ θ ≤ 2π, 0 ≤ z ≤ 1 gives the finite cylinder with height 1 unit (from z = 0 to z = 1). π Similarly, a cone making 4 angle with the z-axis can be represented by z = r in cylindrical π coordinates, or ϕ = 4 in spherical coordinates. Therefore, it is can be parametrized by two different ways:

r1(r, θ) = (r cos θ)i + (r sin θ)j + rk, 0 ≤ r < ∞, 0 ≤ θ ≤ 2π  π   π   π  r (ρ, θ) = ρ sin cos θ i + ρ cos sin θ j + ρ cos k 2 4 4 4  ρ cos θ   ρ sin θ  ρ = √ i + √ j + √ k, 0 ≤ ρ < ∞, 0 ≤ θ ≤ 2π 2 2 2 A sphere with radius 3 can be presented by ρ = 3 in spherical coordinates, and so it can be parametrized by:

r3(θ, ϕ) = (3 sin ϕ cos θ)i + (3 sin ϕ sin θ)j + (3 sin ϕ)k, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π. Parametrization of Graphs If a surface can be represented by a Cartesian equation (i.e. level-set form) such as x2 + y − z3 = 1 and that you can write one of the variables as a function of the other two variables (such as y = z3 − x2 + 1 in this example), then we can use the other two variables as parameters. For instance, the surface y = z3 − x2 + 1 can be parametrized as: r(x, z) = xi + (z3 − x2 + 1) j + zk. | {z } y 4.5 Parametric Surfaces 107

However, x cannot be written as a function of y and z for the surface y = z3 − x2 + 1, since p x = ± z3 − y + 1 and the ± makes it not a function of y and z. On the other hand, z can be written as a function of x and y, since z3 = x2 + y − 1 and there is exactly one cubic root for x2 + y − 1. However, the resulting parametrization  1/3 r(x, y) = xi + yj + x2 + y − 1 k is not easy to work with.

Some Interesting Surfaces Although surfaces such as cones, spheres and cylinders can be easily parametrized, it is not the case for many interesting surfaces. Below are some interesting examples of surfaces. It is not easy to write down (or even to explain) the parametrization since it demands quite a lot of geometric intuitions. A torus (i.e. donut), for example, has a complicated parametrization given by: r(u, v) = ((3 + cos u) cos v) i + ((3 + cos u) sin v) j + (sin u) k, with 0 ≤ u ≤ 2π, 0 ≤ v ≤ 2π.

The Möbius strip, a famous object in topology, has the following parametrization  v u    v u    v u  r(u, v) = 1 + cos cos u i + 1 + cos sin u j + sin k 2 2 2 2 2 2 where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1. Since r denotes the position vector xi + yj + zk, we can also write down a parametrization in an equation form (especially when the expression of the parametrization is too long to fit in one line). For instance, the Möbius strip parametrization can be equivalently written as:  v u  x = 1 + cos cos u 2 2  v u  y = 1 + cos sin u 2 2 v u z = sin 2 2 108 Vector Calculus where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1.

Figure 4.9: Möbius Strip

The following parametrization, which looks intimidating, describes a very beautiful surface called the Klein bottle (see Figure 4.10): 2   x = − cos u 3 cos v − 30 sin u + 90 cos4 u sin u − 60 cos6 u sin u + 5 cos u sin u cos v 15 1 y = − sin u(3 cos v − 3 cos2 u cos v − 48 cos4 u cos v + 48 cos6 u cos v 15 − 60 sin u + 5 cos u sin u cos v − 5 cos3 u cos v sin u − 80 cos5 u sin u cos v + 80 cos7 u sin u cos v) 2 z = (3 + 5 cos u sin u) sin v 15 for 0 ≤ u ≤ π and 0 ≤ v ≤ 2π.

Figure 4.10: the Klein Bottle

Note that the Klein bottle is self-intersecting as depicted in the diagram.

Geometry of Parametric Surfaces Consider a parametrization r(u, v) of a surface. Keeping v = v0 fixed and letting u vary, the parametrization function r(u, v0) depends only on u and it will give a curve on the surface. This curve is often called a u-curve for u is varying. The tangent vector to any u-curve can be computed by taking the u-derivative of r, i.e. ∂r = tangent vector to the u-curve ∂u

Likewise, when u = u0 is fixed while v varies, the function r(u0, v) depends only on v. The curve traced out by this function when v varies is called a v-curve, and ∂r = tangent vector to the v-curve ∂v Page 4

4.5 Parametric Surfaces 109

∂r ∂r Both ∂u and ∂v are tangent to the surface, hence their cross product is normal to the surface (precisely, normal to the tangent plane). A unit normal at each point is therefore given by:

∂r ∂r ∂u × ∂v ˆn = . ∂r × ∂r ∂u ∂v

4.5.2 Surface Integrals We are going to introduce surface integrals in this subsection. Double integrals such as

f (x, y) dA ¨R is a special type of where the region of integration R is on the flat xy-plane. A surface integral is one that the region of integration can be a curved surface. Many geometric and physical quantities, such as surface area, surface flux, and moment of inertia for some shell objects, can be computed using surface integrals. The Stokes’ Theorem to be introduced in the next section also involves surface integrals. We first state the definition of surface integrals, compute some examples and then explain its geometric and physical meaning. Definition 4.6 — Surface Integrals. Given a surface S parametrized by r(u, v) with a ≤ u ≤ b and c ≤ v ≤ d, and a continuous, scaled-valued function f (x, y, z), the surface integral of f over the surface S is denoted and defined to be:

v=d u=b ∂r ∂r f dS = f (r(u, v)) × dudv. ¨S ˆv=c ˆu=a ∂u ∂v

i The line integral of a vector field over a curve C is independent of how we parametrize C. It is also true that the surface integral over a surface S is also independent of the surface parametrization r(u, v). The proof involves change of variables technique in multivariable calculus and is omitted here.

Notation If a surface is closed (meaning it has no boundaries), it is conventional to denote the integral sign by:

‹S Examples of closed surfaces include spheres and torus, while a hemisphere (only the spherical part, the flat part is not included) is not closed since it has a circle as its boundary. 110 Vector Calculus

 Example 4.11 Let S be the sphere of radius a centered at the origin. Evaluate the surface integral   x2 + y2 dS ‹S

 Solution We first parametrize the surface. Using spherical coordinates, the sphere is presented by ρ = a. Take (θ, ϕ) as parameters, then a parametrization of the sphere is given by: r(θ, ϕ) = (a sin ϕ cos θ)i + (a sin ϕ sin θ)j + (a cos ϕ)k where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π.

∂r × ∂r According to the definition of surface integrals, we need to compute ∂θ ∂ϕ . It is straight-forward, although quite tedious:

∂r = (−a sin ϕ sin θ)i + a sin ϕ cos θ)j + 0k ∂θ ∂r = (a cos ϕ cos θ)i + (a cos ϕ sin θ)j + (−a sin ϕ)k ∂ϕ

i j k ∂r ∂r × = −a sin ϕ sin θ a sin ϕ cos θ 0 ∂θ ∂ϕ a cos ϕ cos θ a cos ϕ sin θ −a sin ϕ = (−a2 sin2 ϕ cos θ)i + (−a2 sin2 ϕ sin θ)j + (−a2 sin ϕ cos ϕ)k

∂r ∂r q × = a4 sin4 ϕ(cos2 θ + sin2 θ) + a4 sin2 ϕ cos2 ϕ ∂θ ∂ϕ q = a4 sin2 ϕ(sin2 ϕ + cos2 ϕ) = a2 sin ϕ.

Next we compute the surface integral. When (x, y, z) is on the sphere S, we have: x = a sin ϕ cos θ and y = a sin ϕ sin θ. Therefore, the integrand is given by:

x2 + y2 = a2 sin2 ϕ(cos2 θ + sin2 θ) = a2 sin2 ϕ.

According to the bounds of θ and ϕ in the parametrization, the surface integral of x2 + y2 over S is given by:

ϕ=π θ=2π     ∂r ∂r x2 + y2 dS = x2 + y2 × dθdϕ ‹S ˆϕ=0 ˆθ=0 ∂θ ∂ϕ ϕ=π θ=2π = a2 sin2 ϕ · a2 sin ϕ dθdϕ ˆϕ=0 ˆθ=0 | {z } | {z } x2+y2 ∂r × ∂r ∂θ ∂ϕ ϕ=π θ=2π = a4 sin3 ϕ dθdϕ ˆϕ=0 ˆθ=0 ϕ=π = 2πa4 sin3 ϕ dϕ ˆϕ=0 4 8πa4 = 2πa4 · = . 3 3 4.5 Parametric Surfaces 111

Here we used the fact from single-variable calculus that:

π 4 sin3 ϕ dϕ = . ˆ0 3 We leave this part as an exercise.

 Example 4.12 Let S be the plane 3x + 2y + z = 1 defined over the region 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. Compute the surface integral:

(x + y + z) dS. ¨S

 Solution The equation of the given plane can be written as a graph z = 1 − 3x − 2y over the given region 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. We can take x and y as parameters and so a parametrization of the plane is given by:

r(x, y) = xi + yj + (1 − 3x − 2y)k

where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. Next we compute all the ingredients of the surface integral:

∂r = i − 3k ∂x ∂r = j − 2k ∂y ∂r ∂r × = 3i + 2j + k ∂x ∂y

∂r ∂r p √ × = 32 + 22 + 12 = 14 ∂x ∂y x + y + z = x + y + (1 − 3x − 2y) = 1 − 2x − y

Finally we compute the surface integral:

1 1 √ (x + y + z) dS = (1 − 2x − y) · 14 dxdy ¨S ˆ0 ˆ0 √ 1 h ix=1 = 14 x − x2 − xy dy ˆ0 x=0 √ 1 = 14 −y dy ˆ0 " √ #1 y2 14 = − 2 √ 0 14 = − . 2

Surface Element

Now we explain the geometric and physical meaning of surface integrals. Given a para- metric surface S with parametrization r(u, v), a ≤ u ≤ b and c ≤ v ≤ d, from now on we denote: 112 Vector Calculus

Notation = ∂r × ∂r dS ∂u ∂v dudv

It is a called the surface element of the integral. If we subdivide the domain in the uv-plane into small rectangular pieces with area ∆u ∆v, the parametrization r(u, v) transform them into small pieces ∆S on the surface (see Figure 4.11). Page 3

Figure 4.11: geometric meaning of the surface element

If the number of subdivisions is very large so that each ∆S is very small, then ∆S is approximately a parallelogram. The red side of the parallelogram has length approximately

∂r equal to ∂u ∆u (to see this, regard u is the time then the distance traveled after ∆u unit time is approximately the speed × time. Similarly, the blue side of the parallelogram has length

∂r approximately equal to ∂v ∆v. Suppose the angle of the parallelogram is θ, then the area of ∆S is approximately:

∂r ∂r ∂r ∂r ∂r ∂r ∆u · ∆v · sin θ = sin θ · ∆u ∆v = × ∆u ∆v. ∂u ∂v ∂u ∂v ∂u ∂v | {z } ∂r ∂r | ∂u || ∂v | sin θ Infinitesimally, the ∆S becomes the surface element dS, and ∆u ∆v becomes dudv. In other words, the surface element dS presents the area of a very tiny piece of subdivision on the surface. Summing up the area of all tiny subdivisions, we get the surface area of S, i.e.

dS = surface area of S. ¨S Different geometric or physical meanings of the integrand f give different meanings to the surface integral of f over S. For instance, If f is each element f dS means f dS means ¨S 1 area of dS total surface area surface density mass of dS total mass of the surface density × (x2 + y2) moment of inertia of dS moment of inertia of S Let S be the sphere of radius a centered at the origin. We have computed in Example 4.11 that   8πa4 x2 + y2 dS = . ‹S 3 Suppose this spherical shell has a uniform surface density δ, then the integral:   δ x2 + y2 dS ‹S 4.5 Parametric Surfaces 113

is the moment of inertia Iz about z-axis. Since its formula in many physics textbooks is written in terms of the total mass m of the sphere, let’s rewrite it in terms of m.

 2 2 Iz = δ x + y dS ‹S  8πa4  = δ 3 m 8πa4 = · 4πa2 3 2 = ma2. 3

4.5.3 Surface Surface flux is an important type of surface integrals in both mathematics and physics. It will appear in the statement of the Stokes’ Theorem, and also plays an important in electricity and magnetism. In a nutshell, the surface flux of a vector field F through a surface S measures the “amount” of vectors passing through S. Let’s begin our discussion with the simplest case where the vector field F is constant and

the surface is simply a flat plane. Page 5

Figure 4.12: surface flux for uniform vector field through a plane

The force F can be decomposed into two components, one perpendicular to the plane, another parallel to the plane. As the flux counts only the amount of vectors through the plane, only those perpendicular to the plane should be counted. Suppose the force F makes an angle θ with the unit normal vector ˆn, then the perpendicular component of the force has lengthPage 5 F · ˆn = |F| cos θ. Furthermore, there are vectors at every point on the plane. The larger the plane, the more vectors passing through it. Therefore, the flux of F through the plane should be defined as:

(F · ˆn)(area of the plane).

Now consider a curved surface (so that ˆn is no longer constant) and F is no longer a constant vector field. Infinitesimally, each area element can be regarded as a tiny flat parallelogram, and both F and ˆn can be regarded as constant vectors over each of tiny parallelogram. Recall that dS is the area of this element. Therefore, the flux of F through this tiny bit of surface is given by: F · ˆn dS

Figure 4.13: flux through a tiny surface element 114 Vector Calculus

Summing up, the total flux of F through the entire surface is given by a surface integral as stated in the definition below: Definition 4.7 — Surface Flux. Given a vector field F and a surface S, the surface flux of F through S is defined to be the surface integral:

F · ˆn dS ¨S where ˆn denotes unit normal vector to S at each point.

i Note that there are often two choice of ˆn, so the sign of the surface flux depends on which direction of ˆn is chosen. For a closed surface such as a sphere, it is a convention to choose the outward unit normal.

i There are some surfaces which you cannot “choose” a unit normal convention. For instance, if you pick a normal vector on the Möbius strip and let it vary continuously over the strip, the normal vector may end up pointing at opposite direction when it returns to the original position! We call these non-orientable surfaces. We do not define surface flux on non-orientable surfaces.

Generally speaking, we need to parametrize the surface in order to compute the flux. However, there are some exceptions that can be done without parametrization.

 Example 4.13 Consider the gravitational vector field given in spherical coordinates by:

GMm F = − e . ρ2 ρ

Compute the outward flux of F through the sphere S with radius ρ0 centered at the origin, i.e.

F · ˆn dS. ‹S

 Solution Outward flux means the unit normal vector ˆn is chosen to be the outward normal. For a round sphere centered at the origin, clearly we have ˆn = eρ. Therefore, GMm F · ˆn = − e · e ρ2 ρ ρ GMm 2 = − e ρ2 ρ GMm = − . ρ2 4.5 Parametric Surfaces 115

Hence, the outward flux is given by:

GMm · = − F ˆn dS 2 dS ‹S ‹S ρ GMm = − dS (since ρ = ρ on S) ‹ 2 0 S ρ0 GMm = − dS 2 ‹ ρ0 S | {z } surface area of S GMm = − · 2 = − 2 4πρ0 4πGMm. ρ0

i Using the , we will later see that the above flux is always −4πGMm for any closed orientable surface S provided that it encloses the origin. This fact is commonly called the Gauss’s Law for .

Most surface flux cannot be compute as neatly as in the previous example. Generally speak- ing, the surface flux can be computed by parametrizing the surface. Given a parametrization r(u, v), with a ≤ u ≤ b and c ≤ v ≤ d, of a surface S, the unit normal vector n is given by:

∂r ∂r ∂u × ∂v ˆn = ± ∂r × ∂r ∂u ∂v where ± depends on the convention chosen. It looks complicated to compute the normal vector and the surface flux. However, as a coincidence, the area element dS is given by:

∂r ∂r dS = × dudv. ∂u ∂v

· ∂r × ∂r Therefore, when one multiplies F ˆn by dS, the term ∂u ∂v got canceled! Precisely, we have:

Theorem 4.5 Let r(u, v), with a ≤ u ≤ b and c ≤ v ≤ d, be a parametrization of a surface S. The surface flux of a vector field F through S can computed by:

d b  ∂r ∂r  F · ˆn dS = ± F · × dudv ¨S ˆc ˆa ∂u ∂v where ± depends on the chosen convention of ˆn.

Proof. From the given parametrization r(u, v), we have

d b ∂r ∂r F · ˆn dS = F · ˆn × dudv ¨S ˆc ˆa ∂u ∂v d b ∂r ∂r ∂u × ∂v ∂r ∂r = ± F · × dudv ˆc ˆa ∂r × ∂r ∂u ∂v ∂u ∂v d b  ∂r ∂r  = ± F · × dudv. ˆc ˆa ∂u ∂v

 116 Vector Calculus

Figure 4.14: F and S in Example 4.14

 Example 4.14 Consider the vector field: xi + yj + zk F = −GMm . (x2 + y2 + z2)3/2

Let S be part of the horizontal plane z = 1 over the region x2 + y2 ≤ 1. Compute the surface flux of F through S, i.e. F · ˆn dS. ¨S Here ˆn is chosen to be the upward normal.

 Solution First we parametrize the surface S. Since the plane z = 1 is a graph over the xy-plane, it seems the easiest way is to use x and y as parameters. However, the base region x2 + y2 ≤ 1 is not a rectangle so it may be difficult to set up the ranges of x and y for the parametrization. Since the base region is a solid circle, we can use cylindrical coordinates too. Let r(r, θ) = (r cos θ)i + (r sin θ)j + k (since z = 1), 4.5 Parametric Surfaces 117

where 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. Next we compute all the ingredients:

∂r = (cos θ)i + (sin θ)j ∂r ∂r = (−r sin θ)i + (r cos θ)j ∂θ

i j k ∂r ∂r × = cos θ sin θ 0 ∂r ∂θ −r sin θ r cos θ 0 = (r cos2 θ + r sin2 θ)k = rk (which is upward) (r cos θ)i + (r sin θ)j + zk F = −GMm (r2 cos2 θ + r2 sin2 θ + z2)3/2 (r cos θ)i + (r sin θ)j + k = −GMm (since z = 1) (r2 + 1)3/2  ∂r ∂r  GMmr F · × = − . ∂r ∂θ (r2 + 1)3/2

Therefore, by Theorem 4.5, the surface flux is given by

2π 1  ∂r ∂r  F · ˆn dS = F · × drdθ ¨S ˆ0 ˆ0 ∂r ∂θ 2π 1 GMmr = − drdθ 2 3/2 ˆ0 ˆ0 (r + 1) 2π  GMm r=1 = √ dθ (by inspection) 2 ˆ0 1 + r r=0 2π  1  = GMm √ − 1 dθ ˆ0 2  1  = 2πGMm √ − 1 . 2

Figure 4.15: F and S in Example 4.15

 Example 4.15 Let F =√xi − yj. Find the upward flux of F over S which is the upper part of the sphere with radius 2 centered at the origin cut out by the plane z = 1. 118 Vector Calculus

 Solution Since the surface is spherical, it is usually the best to use spherical coordinates√ to parametrize it. Under spherical coordinates, the sphere is represented by ρ = 2, so we use θ and ϕ for the parameters. Let D√ √ √ E r(θ, ϕ) = 2 sin ϕ cos θ, 2 sin ϕ sin θ, 2 cos ϕ .

π The domain of the parameters are 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ 4 . As in the previous example, we first compute all the ingredients. Since they are all straight-forward computations, some detail will be omitted here.

∂r D √ √ E = − 2 sin ϕ sin θ, 2 sin ϕ cos θ, 0 ∂θ ∂r D√ √ √ E = 2 cos ϕ cos θ, 2 cos ϕ sin θ, − 2 sin ϕ ∂ϕ ∂r ∂r D E × = −2 cos ϕ sin2 θ, −2 sin ϕ sin2 θ, −2 cos ϕ sin ϕ ∂θ ∂ϕ D√ √ E F = xi − yj = 2 sin ϕ cos θ, − 2 sin ϕ sin θ, 0  ∂r ∂r  √ F · × = −2 2 cos 2θ sin3 ϕ. ∂θ ∂ϕ

∂r ∂r Note that ∂θ × ∂ϕ obtained above is a downward normal since the k-component is negative. The upward flux is given by:

2π π/4  ∂r ∂r  F · ˆn dS = F · − × dϕdθ ¨S ˆ0 ˆ0 ∂θ ∂ϕ 2π π/4 √ = 2 2 cos 2θ sin3 ϕ dϕdθ ˆ0 ˆ0 √ 2π ! π/4 ! = 2 2 cos 2θ dθ sin3 ϕ dϕ = 0. ˆ0 ˆ0 | {z } =0

Physical Interpretations of Surface Flux Generally speaking, the flux of a vector field F through a surface S measures the net “amount” of vectors F passing through S along a chosen direction of normal vector ˆn. The unit of the “amount” depends on the physical meaning of the vector field F. For instance, if u is the velocity vector field of fluid (in ms−1), and the surface area of S has unit m2, then the surface flux u · ˆn dS ¨S has unit m3s−1 and so it measures the net volume of fluid through the surface S in the direction of ˆn. If the flux is positive, then there is more fluid flowing along the direction ˆn than against it. On the other hand, if the flux is negative, there is more fluid flowing against ˆn than along it. If S is a closed surface (such as a sphere, a cube, a torus), it is a convention to take ˆn as the outward unit normal. The surface flux

u · ˆn dS ‹S over this closed surface measures the net volume of fluid flowing in the direction of ˆn through the surface, or in other words, the net volume of fluid flowing out from region D enclosed by S. 4.5 Parametric Surfaces 119

If one denotes $ as the uniform density of the fluid, then

$u · ˆn dS ‹S measures the net mass of the fluid flowing out from the enclosed region D through its boundary S per unit time. Assuming there is no sink or source inside D, by the conservation of mass, the rate of change of the total mass of fluid enclosed by S is related to the surface flux by:

∂ $ dV = − $u · ˆn dS. ∂t ˚D ‹S | {z } total mass in D

Later we can apply the Divergence Theorem on the above relation to derive an important equation to the fluid flow. Now suppose the vector field J represents the transfer of heat energy in unit Joule per second. Note that while energy is a scalar, the transfer of heat at different point may have a different direction and so it is a vector quantity. Again, take S to be a closed surface enclosing a solid region D, then the surface flux of j through S:

J · ˆn dS ‹S measures the amount of heat energy flowing out from the region D through S. This flux integral is commonly called the heat flux through S by physicists. If E is a electric field and, again, S is a closed surface, then the flux integral

E · ˆn dS ‹S is commonly called the electric flux through S. A result by Gauss claims that this flux integral is proportional to the total amount of charges enclosed by the surface. If B is a magnetic field and, again, S is a closed surface, then the flux integral

B · ˆn dS ‹S is commonly called the magnetic flux through S. Gauss said that it must be zero. 120 Vector Calculus 4.6 Stokes’ Theorem This section was written during the author’s first flight of Boeing 787 from Boston to Tokyo. The author is grateful that Japan Airlines offer in-seat power for his laptop throughout the entire 14-hour journey.

4.6.1 Stokes’ Theorem for Simply-Connected Surfaces Recall that the Green’s Theorem relates the line integral of a closed curve to a certain double integral over the region enclosed by the curve. The Stokes’ Theorem is its generalization to the three dimenions, which allows the region to be a curved surface in R3. Theorem 4.6 — Stokes’ Theorem. Let S be an orientable, simply-connected surface in R3, and C be the boundary curve of the surface S. Suppose F is a vector field which is defined and is continuously differentiable on and near the surface S, then we have:

F · dr = (∇ × F) · ˆn dS ˛C ¨S | {z } | {z } line integral surface integral

where ˆn is the unit normal vector to S, with direction determined by the right-hand rule (see Figure 4.16). 15.7 Stokes’ Theorem 1147

n Stokes’ Theorem involves an oriented curve C and an oriented surface S on which n n there are two unit normal vectors at every point. These orientations must be consistent and S the normal vectors must be chosen correctly. Here is the right-hand rule that relates the n orientations of S and C, and determines the choice of the normal vectors: If the fingers of your right hand curl in the positive direction around C, then your right C thumb points in the (general) direction of the vectors normal to S (Figure 15.59).

FIGURE 15.59 A common situation occurs when C has a counterclockwise orientation when viewed from Figure 4.16: right-hand rule above; then, the vectors normal to S point upward.

i By comparing the statements➤ The of right-hand the Green’s rule and tells Stokes’ you Theorems, which of one can easily see that the Green’s Theorem is a special case of the Stokes’ Theorem, in a sense that the former applies to andtwo the normal flat region vectors enclosed at a by point the curve of S onto theuse.xy -plane. TheTHEOREM unit 15.13 Stokes’ Theorem 3 normal vector ˆn for the planeRemember region is that obviously the direction given by ofk if normal the plane curve is travelingLet S be a smooth oriented surface in ޒ with a smooth closed boundary C whose in the counter-clockwise orientation. vectors changes continuously on an orientation is consistent with that of S. Assume that F = 8 f, g, h9 is a vector oriented surface. field whose components have continuous first partial derivatives on S. Then i For closed curves in the three-dimensional space, one cannot say whether they are counter- clockwise or clockwise as it depends on the direction of observations. Therefore, the ,counter-clockwise convention of the Green’s Theorem is generalized to the right-hand rule F # dr = 1ٌ * F2 # n dS condition in the statement of the Stokes’ Theorem. C O C S i The Stokes’ Theorem applies only on orientable surfaces. That says, it may not hold for surfaces such as the Möbius strip. Also, the condition where the vector field F needs towhere be n is the unit vector normal to S determined by the orientation of S. defined and is continuously differentiable on and near the surface S is crucial. However, we will mostly deal with vector fields that satisfy this condition.

i The condition that S has to be simply-connected is also crucial, but we will later learn how to modify the Stokes’QUICK Theorem CHECK so as to 1 allow Suppose non-simply-connected that S is a surface S. The meaning of Stokes’ Theorem is much the same as for the circulation form of region in the xy-plane with a boundary Green’s Theorem: Under the proper conditions, the accumulated rotation of the vector The proof of the Stokes’ Theorem is omitted here. Interested readers mayfield consult over the the surface S (as given by the normal component of the curl) equals the net cir- textbook for a proof of oneoriented special case. counterclockwise. Let’s look at some examples.What is the normal to S? Explain why Stokes’ culation on the boundary of S. An outline of the proof of Stokes’ Theorem is given at the

Theorem becomes the circulation form end of this section. First, we look at some special cases that give further insight into the of Green’s Theorem. ➤ theorem. If F is a conservative vector field on a domain D, then it has a potential function w ;(such that F = ٌw. Because ٌ * ٌw = 0, it follows that ٌ * F = 0 (Theorem 15.9 therefore, the circulation integral is zero on all closed curves in D. Recall that the circula- tion integral is also a work integral for the force field F, which emphasizes the fact that no work is done in moving an object on a closed path in a conservative force field. Among the important conservative vector fields are the radial fields F = r>͉r͉p, which generally have zero curl and zero circulation on closed curves.

➤ Recall that for a constant nonzero vector EXAMPLE 1 Verifying Stokes’ Theorem Confirm that Stokes’ Theorem holds for 8 9 a and the position vector r = x, y, z , the vector field F = 8z - y, x, -x9, where S is the hemisphere x 2 + y 2 + z 2 = 4, for = * the field F a r is a rotational field. z Ú 0, and C is the circle x 2 + y 2 = 4 oriented counterclockwise. In Example 1, SOLUTION The orientation of C says that the vectors normal to S point in the out- F = 80, 1, 19 * 8x, y, z9. ward direction. The vector field is a rotation field a * r, where a = 80, 1, 19 and r = 8x, y, z9; so the axis of rotation points in the direction of the vector 80, 1, 19 Axis of rotation (Figure 15.60). We first compute the circulation integral in Stokes’ Theorem. The z of F is ͗0, 1, 1͘. curve C with the given orientation is parameterized as r1t2 = 82 cos t, 2 sin t, 09, 2 2 2 S: x ϩ y ϩ z ϭ 4 ϭ ͗ Ϫ Ϫ ͘ … … p Ј1 2 = 8- 9 Ն F z y, x, x for 0 t 2 ; therefore, r t 2 sin t, 2 cos t, 0 . The circulation integral is z 0 n 2 # 2p # F dr = F rЈ1t2 dt Definition of line integral C L0 C 2p # = 8z - y, x, -x9 8-2 sin t, 2 cos t, 09 dt Substitute. e

y L d C 0 2 cos t -2 sin t x 2p FIGURE 15.60 = 41sin2 t + cos2 t2 dt Simplify. L0 4.6 Stokes’ Theorem 121

2 2 2  Example 4.16 Let S be the hemisphere x + y + z = 4, z ≥ 0 above the the xy-plane, and C be its boundary curve oriented counter-clockwise on the xy-plane. Given F = (z − y)i + xj − xk. Determine the line integral

F · dr ˛C using the Stokes’ Theorem.

 Solution The Stokes’ Theorem asserts that:

F · dr = (∇ × F) · ˆn dS. ˛C ¨S Since the RHS is a surface integral, we need to parametrize it first in order to compute it. Using spherical coordinates, the parametrization of S is given by:

r(θ, ϕ) = (2 sin ϕ cos θ)i + (2 sin ϕ sin θ)j + (2 cos ϕ)k

π where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ 2 . Then, ∂r = (−2 sin ϕ cos θ)i + (2 sin ϕ cos θ)j + 0k ∂θ ∂r = (2 cos ϕ cos θ)i + 2 cos ϕ sin θ)j + (−2 sin ϕ)k ∂ϕ ∂r ∂r × = (−4 sin2 ϕ cos θ)i + (−4 sin2 ϕ sin θ)j + (−4 sin ϕ cos ϕ)k ∂θ ∂ϕ ∂r ∂r × = (4 sin2 ϕ cos θ)i + (4 sin2 ϕ sin θ)j + (4 sin ϕ cos ϕ)k ∂ϕ ∂θ

∂r ∂r ∂r ∂r Note that ∂θ × ∂ϕ is pointing downward, so we use ∂ϕ × ∂θ instead. Next we need to compute ∇ × F:

i j k

∇ × F = ∂ ∂ ∂ ∂x ∂y ∂z z − y x −x = 2j + 2k. 122 Vector Calculus

Therefore, using the Stokes’ Theorem, we have

F · dr = (∇ × F) · ˆn dS ˛C ¨S = (2j + 2k) · ˆn dS ¨S π/2 2π  ∂r ∂r  = (2j + 2k) · × dθdϕ (using Theorem 4.5) ˆ0 ˆ0 ∂ϕ ∂θ π/2 2π = (8 sin2 ϕ sin θ + 8 sin ϕ cos ϕ) dθdϕ ˆ0 ˆ0 π/2 ! 2π ! π/2 = 8 sin2 ϕ dϕ sin θ dθ +2π 4 sin 2ϕ dϕ ˆ0 ˆ0 ˆ0 | {z } =0  1 π/2 = 8π − cos 2ϕ = 8π. 2 0

i Although the Stokes’ Theorem was used in the previous example (as required by the problem), it is actually easier to compute the line integral directly by parametrizing the curve: Let r(t) = (2 cos t)i + (2 sin t)j + 0k where 0 ≤ t ≤ 2π. Then

F · dr = F · r0(t) dt ˛C ˛C 2π = ((z − y)i + xj − xk) · ((−2 sin t)i + (2 cos t)j + 0k) dt ˆ0 2π = (−2(z − y) sin t + 2x cos t) dt ˆ0 2π = −2(−2 sin t) sin t + 2(2 cos t) cos t dt ˆ0 2π = 4(sin2 t + cos2 t) dt = 8π. ˆ0

The purpose of the previous example is simply to illustrate how to use the Stokes’ Theorem, although it is not necessary to use it. The line integral in the next example, however, would be extremely difficult to compute without the Stokes’ Theorem.

 Example 4.17 Let C be the curve of intersection of the plane Ax + By + Cz = 0 and the sphere x2 + y2 + z2 = a2. Show that

πa2(A + B + C) ydx + zdy + xdz = ± √ ˛C A2 + B2 + C2 where ± is determined by the orientation of C.

 Solution The plane Ax + By + Cz = 0 passes through the origin. Therefore, the curve C is a great circle on the sphere. However, this great circle is neither horizontal or vertical, so it is difficult to parametrize C to compute the line integral. Let’s use the Stokes’ Theorem to see if there is any luck! When using the Stokes’ Theorem, one needs to pick a surface S whose boundary curve is C. There are three choices: 1. region of the plane enclosed by C 2. the hemisphere above the plane 3. the hemisphere below the plane 4.6 Stokes’ Theorem 123

All of the above choice should give the same conclusion. However, let’s pick the region of the plane enclosed by C to be the surface S and we will explain why it is the smartest choice among all three. Now the given line integral is associated to the vector field F = yi + zj + xk, i.e.

ydx + zdy + xdz = F · dr. ˛C ˛C In order to apply the Stokes’ Theorem, we need to compute the curl:

i j k

∇ × F = ∂ ∂ ∂ ∂x ∂y ∂z y z x = −i − j − k.

We also need the unit normal vector ˆn, but since the region S is a plane whose equation is Ax + By + Cz = 0. The unit normal is a constant vector given by:

Ai + Bj + Ck ˆn = ± √ A2 + B2 + C2 where ± is determined by the orientation of C. Therefore,

 Ai + Bj + Ck  A + B + C (∇ × F) · ˆn = ± (−i − j − k) · √ = ± √ . A2 + B2 + C2 A2 + B2 + C2 Next, we apply the Stokes’ Theorem on these C and S:

F · dr = (∇ × F) · ˆn dS ˛C ¨S A + B + C = ± √ dS ¨S A2 + B2 + C2 A + B + C = ± √ dS A2 + B2 + C2 ¨S | {z } surface area πa2(A + B + C) = ± √ . A2 + B2 + C2

Note that the surface area of the region S on the plane is πa2, since its boundary is a circle with radius a. There are two major reasons why the part of the plane enclosed by C is a smarter choice for S than the hemispheres. For one thing, both ∇ × F and ˆn are constant vector field if S is chosen to be a planar region, so that computing its surface integral is very easy – no parametrization! For another, if any of the hemispheres were chosen to be S, then the surface integral needs to be computed by parametrization – which can be tedious. It is also very difficult to determine the range of values of ϕ and θ since the plane cutting the sphere is not a horizontal one.

Occasionally, the Stokes’ Theorem can be applied to evaluate a surface integral over an arbitrary or complicated surface, which is not easy to be parametrized. If the given vector field G can be expressed in the form of F = ∇ × G for another vector field G, by the (backward) Stokes’ Theorem asserts that:

F · ˆn dS = (∇ × G) · ˆn dS = G · dr ¨S ¨S ˛C 124 Vector Calculus

where C is the boundary curve of the surface S. Very often, the line integral is easier to compute if one can parametrize it.

i Note that the above discussion holds only when the given vector field F is of the form F = ∇ × G. If such an F is not in this form, there is no easy way to apply the Stokes’ Theorem backward.

 Example 4.18 Let C be an arbitrary simple closed curve on the xy-plane in the xyz-space, and S be an arbitrary surface above the xy-plane with boundary curve C. z y  1. Verify that i = ∇ × − 2 j + 2 k . 2. Show that: i · ˆn dS = 0. ¨S

 Solution Part (1) is straight-forward:

i j k  z y  ∂ ∂ ∂ ∇ × − j + k = 2 2 ∂x ∂y ∂z z y 0 − 2 z  ∂  y  ∂  z  = − − i − 0j + 0k ∂y 2 ∂z 2 = i

z y For part (2), we denote G = − 2 j + 2 k for simplicity. Then i = ∇ × G. Applying the Stokes’ Theorem backward, we get:

i · ˆn dS = (∇ × G) · ˆn dS ¨S ¨S = G · dr. ˛C Since the curve C is arbitrary in nature, there is no way to parametrize C. However, it is given that C is on the xy-plane! Therefore, one can use the Green’s Theorem to evaluate the above line integral. Denote R to be the region on the xy-plane enclosed by the curve C, then the Green’s Theorem asserts that:

G · dr = (∇ × G) · k dA ˛C ¨R = i · k dA = 0 dA = 0. ¨R ¨R

4.6.2 Significance of the Stokes’ Theorem Interpretation of Curl Using the Stokes’ Theorem, one can give a geometric interpretation of ∇ × F. Consider a tiny surface S with boundary curve C. Denote ˆn to be the unit normal vector of S with direction determined by the right-hand rule. Since the surface S is very small, one can regard the quantity (∇ × F) · ˆn is nearly a constant over the surface S. By the Stokes’ Theorem, we have:   F · dr = (∇ × F) · ˆn dS ' (∇ × F) · ˆn dS . ˛C ¨S ¨S

Therefore, F · dr. (∇ × F) · ˆn ' C . Surface¸ area of S 4.6 Stokes’ Theorem 125

This quantity is large when the vector field F is circular about the normal vector ˆn. In other words, the quantity (∇ × F) · ˆn measures the circulation density around any given point. That’s why ∇ × F is often called the curl of F.

Page 6 Conservative Vector Fields Recall that a vector field F is conservative if there exists a scalar function f such that F = ∇ f . If F is defined and continuously differentiable everywhere (or on a simply-connected domain), then the curl test asserts that F is conservative if and only if ∇ × F = 0. Using the Stokes’ Theorem, if F is conservative, then:

F · dr = (∇ × F) · ˆn dS = 0, ˛C ¨S | {z } =0

which recovers the result we proved before in Theorem 4.1. Of course, the Stokes’ Theorem as- serts more than Theorem 4.1 does because it applies to any vector field, not just the conservative ones.

4.6.3 Surfaces with Multiple Boundaries When applying the Stokes’ Theorem on surfaces with multiple boundaries (i.e. not simply- connected), such as the one in Figure 4.17, one needs to be very careful when dealing with the inner boundary. The form of the Stokes’ Theorem, as stated in Theorem 4.6, applies only to simply-connected regions. However, using a similar technique illustrated in the subsection about the winding number, one can extend the Stokes’ Theorem so that it can be applied to surfaces with holes as well. Take the region in Figure 4.17 as an example. One can subdivide the surface by cutting along arcs that connect the outer boundary and the inner boundaries.

Figure 4.17: apply Stokes’ Theorem for surfaces with holes

Each sub-surface of S1, S2 and S3 now becomes simply-connected and so the Stokes’ Theorem applies to each sub-surface: ! + − + F · dr = (∇ × F) · ˆn dS ˆ ˆ ˆ − ˆ ¨ C1 L1 Γ1 L2 S1 | {z } form the boundary of S1 ! + − + + − − − F · dr = (∇ × F) · ˆn dS ˆ ˆ ˆ − ˆ ˆ ˆ ˆ + ˆ ¨ C2 L3 Γ2 L4 C4 L2 Γ1 L1 S2 | {z } form the boundary of S2 ! − − − F · dr = (∇ × F) · ˆn dS. ˆ ˆ ˆ + ˆ ¨ C3 L4 Γ2 L3 S3 | {z } form the boundary of S3 126 Vector Calculus

Summing up all three equations and cancelling out the Li’s terms, we get: ! + + + − − − − F · dr ˆ ˆ ˆ ˆ ˆ − ˆ + ˆ − ˆ + C1 C2 C3 C4 Γ1 Γ1 Γ2 Γ2   = + + (∇ × F) · ˆn dS ¨ ¨ ¨ S1 S2 S3

Combining all Ci’s, Γi’s and Si’s, we yield:

F · dr− F · dr − F · dr = (∇ × F) · ˆn dS. ˛ ˛ ˛ ¨ C Γ1 Γ2 S While the surface integral (RHS) is in the same form as in the usual Stokes’ Theorem, the LHS is not summing up the boundary line integrals, but rather the outer boundary has a plus sign in front and the inner boundaries each has a minus sign in front. For more complicated surfaces (with many, but finitely many, holes), one can apply the technique illustrated above to establish: Theorem 4.7 — Stokes’ Theorem for Higher Genusa Surfaces. Let S be an orientable surface 3 in R with n holes. Denote C to be its outer boundary, and Γ1, Γ2, ... , Γn to be its inner boundaries. Suppose F is a vector field is defined and continuously differentiable on and near the surface S, then:

n F · dr − F · dr = (∇ × F) · ˆn dS. ˛ ∑ ˛ ¨ C i=1 Γi S Here ˆn is the unit normal vector to S with orientation determined by the right-hand rule applied to the outer boundary C.

aThe word genus is the mathematical term for number of “holes” inside the surface. 4.7 Divergence Theorem 127 4.7 Divergence Theorem The Green’s and Stokes’ Theorems relate the line integral of a vector field over a simple closed curve with a double/surface integral of the curl of the vector field. In this section, we are going to learn the Divergence Theorem, which relates the surface integral over a closed surface with a triple integral over the solid region enclosed by the surface.

4.7.1 Divergence Operator In order to state the Divergence Theorem, we need to define: Definition 4.8 — Divergence Operator. Given a continuously differentiable vector field F in R3 whose components in rectangular coordinates are:

F = Fxi + Fyj + Fzk,

the divergence of F, denoted by ∇ · F, is defined as:

∂F ∂Fy ∂F ∇ · F = x + + z . ∂x ∂y ∂z

Note that if the vector field F is given in spherical coordinates, the divergence of F is not defined as: ∂Fρ ∂F ∂Fϕ ∇ · F = + θ + WRONG! ∂ρ ∂θ ∂ϕ Using the chain rule and the conversion rules of components, one can convert ∇ × F from the rectangular coordinate form to both spherical and cylindrical forms:

1 ∂   1 ∂ 1 ∂F ∇ · F = ρ2F + (F sin ϕ) + θ ρ2 ∂ρ ρ ρ sin ϕ ∂ϕ θ ρ sin ϕ ∂θ 1 ∂ 1 ∂F ∂F = (rF ) + θ + z . r ∂r r r ∂θ ∂z

The derivation, although straight-forward, is very tedious and is therefore omitted here. We will see the geometric interpretation of ∇ × F after we state the Divergence Theorem. Essentially, it measures how diverging the vector field is.

4.7.2 Divergence Theorem for Solids without Holes Theorem 4.8 — Divergence Theorem. Let S be a closed orientable surface enclosing a simply- connected solid region D. Suppose F is a vector field defined and being continuously differentiable in and near the region D, then we have:

F · ˆn dS = ∇ · F dV . ‹S ˚D | {z } | {z } surface integral triple integral

Here ˆn is the outward normal of S.

The Divergence Theorem is particularly useful for computing the flux over a closed surface, as the theorem says we do not need to parametrize the surface and compute the normal vector, and all is needed is to compute a triple integral instead, which is often easier than computing surface integrals. Let’s look at some examples. 128 Vector Calculus

 Example 4.19 Consider the vector field F = 3xi + 4yj − 5zk. Let S be the sphere with radius a centered at the origin. Evaluate the flux integral:

F · ˆn dS ‹S where ˆn is the outward normal.

 Solution You can imagine the computation would be quite tedious if we computed this flux integral directly by parametrizing the sphere. However, since this is a flux integral over a closed surface, one can try to use the Divergence Theorem to compute it! By easy computations, we have:

∂ ∂ ∂ ∇ · F = (3x) + (4y) + (−5z) = 3 + 4 − 5 = 2. ∂x ∂y ∂z

Denote D to be the solid sphere with radius a centered at the origin, i.e. the solid region enclosed by S. By the Divergence Theorem, we have:

4 8 F · ˆn dS = ∇ · F dV = 2 dV = 2 · πa3 = πa3. ‹S ˚D ˚D 3 3

2 x  Example 4.20 Let F = x i + 4xyzj + ze k, D be the rectangular box defined by 0 ≤ x ≤ 3, 0 ≤ y ≤ 2 and 0 ≤ z ≤ 1 and S be the boundary surface of the D (i.e. S is the shell of the box). Evaluate the flux integral: F · ˆn dS ‹S where ˆn is the outward normal.

 Solution The closed surface S has six faces! If one attempts to compute the flux integral directly, one needs to split it into six integrals, corresponding to each of its six faces. However, if one applies the Divergence Theorem, the difficult surface integral becomes a triple integral over a rectangular region which is very easy to set-up. We first compute:

∂ ∂ ∂ ∇ · F = (x2) + (4xyz) + (zex) = 2x + 4xz + ex. ∂x ∂y ∂z

By the Divergence Theorem, we get:

F · ˆn dS = ∇ · F dV ‹S ˚D z=1 y=2 x=3 = (2x + 4xz + ex) dxdydz ˆz=0 ˆy=0 ˆx=0 = 34 + 2e3.

We omit the computational detail of the triple integral above, which is a very straight-forward. 4.7 Divergence Theorem 129

Let’s demonstrate one example using the cylindrical form of the divergence.

2  Example 4.21 Let F = r(2 + sin θ)er + (r sin θ cos θ) eθ + 3zk, and S be the quarter-cylinder r = 2, 0 ≤ θ ≤ π and 0 ≤ z ≤ 5. Compute the surface flux:

F · ˆn dS. ‹S

 Solution The cylinder has 5 faces. It would be time-consuming to compute the total flux face-by-face directly. Let’s try to use the Divergence Theorem (since S is a closed surface). The vector field F is given in cylindrical form, and its components are:

 2  Fr = r 2 + sin θ , Fθ = r sin θ cos θ, Fz = 3z.

Therefore, we have:

1 ∂   1 ∂ ∂(3z) ∇ · F = r2(2 + sin2 θ) + (r sin θ cos θ) + r ∂r r ∂θ ∂z 1 = · 2r(2 + sin2 θ) + (cos2 θ − sin2 θ) + 3 r = 4 + 2 sin2 θ + cos2 θ − sin2 θ = 4 + cos2 θ + sin2 θ = 5.

Therefore, by the Divergence Theorem,

π(2)2(5) ∇ · F dS = ∇ · F dV = 5 dV = 5 · = 25π. ‹ ˚ ˚ 4 S D D | {z } volume ofthe cylinder

Here D denotes the solid enclosed by S.

One should note that the surface integral stated in the Divergence Theorem is

F · ˆn dS ‹S but not of (∇ × F) · ˆn. In fact, using the Divergence Theorem, one can show that

(∇ × F) · ˆn dS = 0 ‹S for any continuously differentiable vector field F. It is because:

(∇ × F) · ˆn dS = ∇ · G dV ‹S | {z } ˚D =:G = ∇ · (∇ × F) dV. ˚D We next show that ∇ · (∇ × F) = 0 for any continuously differentiable vector field F.       ∂F ∂Fy ∂F ∂F ∂Fy ∂F ∇ × F = z − i + x − z j + − x k ∂y ∂z ∂z ∂x ∂x ∂y       ∂ ∂F ∂Fy ∂ ∂F ∂F ∂ ∂Fy ∂F ∇ · (∇ × F) = z − + x − z + − x ∂x ∂y ∂z ∂y ∂z ∂x ∂z ∂x ∂y 2 2 2 2 2 2 ∂ F ∂ Fy ∂ F ∂ F ∂ Fy ∂ F = z − + x − z + − x . ∂x∂y ∂x∂z ∂y∂z ∂y∂x ∂z∂x ∂z∂y 130 Vector Calculus

Using the Mixed Partials Theorem, all of the above second derivatives are canceled out, and so ∇ · (∇ × F) = 0. Therefore, we get:

(∇ × F) · ˆn dS = 0. ‹S

4.7.3 Interpretation of Divergence Operator By taking a tiny solid region D with boundary surface S, then the Divergence Theorem applied to a vector field F asserts that:

F · ˆn dS = ∇ · F dV. ‹S ˚D

When the region D is very small, one can regard ∇ × F is nearly a constant, and so we have:

F · ˆn ' (∇ · F) × volume of D. ‹S

This result gives a geometric interpretation of ∇ · F:

1 ∇ · F ' F · ˆn dS. volume of D ‹S

In other words, ∇ × F measures the flux density near a point. The more diverging F is around a point, the higher the flux over a tiny closed surface around that point, resulting in greater value of ∇ · F. This justifies the use of name divergence for ∇ · F.

4.7.4 Limitations of Divergence Theorem The condition that F has to be defined in the region D is crucial. Consider the gravitational force field: GMm F = − e . ρ2 ρ

If S is the sphere with radius a centered at the origin, then its outward unit normal is simply eρ and so

1 GMm · = − · = − · 2 = − F ˆn dS GMm 2 eρ eρ dS 2 4πa 4πGMm. ‹S ‹S ρ a

If one attempts to apply the Divergence Theorem on this vector field, then by the spherical form of the divergence operator, we can easily see that:

1 ∂  1  1 ∂ ∇ · F = −GMm ρ2 · = −GMm (1) = 0 ρ ∂ρ ρ2 ρ ∂ρ

whenever ρ 6= 0, i.e. everywhere except the origin. The triple integral

∇ · F dV ˚D is an improper integral since D (which is the solid sphere enclosed by S) contains the origin at which F is undefined! To conclude, one needs to be very careful when applying the Divergence Theorem if the region D contains some points at which the vector field is not defined. Page 5

4.7 Divergence Theorem 131

4.7.5 Gauss’s Law for Gravity The purpose of this subsection is to give a proof of the Gauss’s Law for Gravity (assuming the inverse-square law), which says that the gravitational flux: GMm · 2 eρ ˆn dS ‹S ρ is given by 4πGMm for any closed surface S enclosing the origin. We have shown that it is so when S is a sphere centered at the origin, and we are going to use the Divergence Theorem to show that it is always true for any closed surface S enclosing the origin. However, we need to be very careful when applying the Divergence Theorem since the gravitational field is undefined at the origin. We will adopt the “hole-drilling” technique which was previously used in computing the winding number integral. Given a solid D containing the origin, we first construct a small sphere B with radius a centered at the origin. Then, the solid D\B (i.e. the solid D with B removed) is a solid not enclosing the “bad” point origin. Next, we cut this solid into two parts by the horizontal plane z = 0. Label each side of the resulting solids by Si, Π and Σi as shown in the Figure 4.18. Note that Π is the common side.

Figure 4.18: applying Divergence Theorem on the gravitational force field

Gluing S1, Π and Σ1 together gives a closed surface not enclosing the origin. Denote D1 to be the solid enclosed by this closed surface. Hence, one can apply the Divergence Theorem without any issue:

  GMm  GMm  + + e · ˆn dS = ∇ · e dV = 0 ¨ ¨ ¨ 2 ρ ˚ 2 ρ S1 Π Σ1 ρ D1 ρ | {z } =0

where ˆn is the outward unit normal of the boundary surface of D1. Denote ˆnup and ˆndown to be the upward and downward normal vector respectively. The above integrals can be expressed as: GMm GMm GMm e · ˆn dS + e · ˆn dS + e · ˆn dS = 0. (4.3) ¨ 2 ρ up ¨ 2 ρ down ¨ 2 ρ down S1 ρ Π ρ Σ1 ρ

Similarly, gluing S2, Π and Σ2 together gives a closed surface not enclosing the origin. By the Divergence Theorem applied to this surface, we get: GMm GMm GMm e · ˆn dS + e · ˆn dS + e · ˆn dS = 0. (4.4) ¨ 2 ρ down ¨ 2 ρ up ¨ 2 ρ up S2 ρ Π ρ Σ2 ρ 132 Vector Calculus

We then add up the above two equations. First note that S1 and S2 can glue together to form the closed surface S. Both ˆnup of S1, and ˆndown of S2 become the outward normal of S. Therefore, GMm GMm GMm e · ˆn dS + e · ˆn dS = e · ˆn dS. ¨ 2 ρ down ¨ 2 ρ up ¨ 2 ρ outward S1 ρ S2 ρ S ρ

For the planar surface Π, the downward normal ˆndown is in the opposite direction of the upward normal ˆnup, i.e. ˆndown = − ˆnup. Therefore,

GMm GMm · + · = 2 eρ ˆndown dS 2 eρ ˆnup dS 0. ¨Π ρ ¨Π ρ

Finally, the surfaces Σ1 and Σ2 glue together to form the closed sphere Σ. The normal vectors ˆndown of Σ1, and ˆnup of Σ2, are the inward unit normal of Σ. Therefore,

GMm GMm GMm e · ˆn dS + e · ˆn dS = e · ˆn dS. ¨ 2 ρ down ¨ 2 ρ up ¨ 2 ρ inward Σ1 ρ Σ2 ρ Σ ρ Summing up (4.3) and (4.4), we get:

GMm GMm · + + · = 2 eρ ˆnoutward dS 0 2 eρ ˆninward dS 0. ‹S ρ ‹Σ ρ Therefore, GMm GMm · = − · 2 eρ ˆnoutward dS 2 eρ ˆninward dS ‹S ρ ¨Σ ρ GMm = · 2 eρ ˆnoutward dS ‹Σ ρ = 4πGMm (computed before)

This holds true for any closed surface S enclosing the origin. This proves the Gauss’s Law for Gravity (assuming the inverse-square law). However, if S does not enclose the origin, then one can apply the Divergence Theorem on the gravitational vector field without any issue. To conclude, for any closed surface S not passing through the origin, we have: ( 1 4π if S encloses the origin; · = 2 eρ ˆn dS ‹S ρ 0 otherwise.

In the spirit of the Divergence Theorem, i.e.

1  1  · = ∇ · 2 eρ ˆn dS 2 eρ dV, ‹S ρ ˚D ρ we can decree that (  1  4π if D contains the origin; ∇ · = 2 eρ dV ˚D ρ 0 otherwise   ∇ · 1 e even though the integrand ρ2 ρ is undefined at the origin. This fact motivates the introduction of Dirac delta functions to be discussed in the next chapter. 5 — Topics in Physics and Engineering

5.1 Coulomb’s Law An electric field, commonly denoted by E, is a vector field in R3. At every point (x, y, z), the vector E(x, y, z) represents the force that would be exerted on a stationary test particle of unit positive charge. An electric field can be generated by the presence of electric charges, or time-varying magnetic forces. We are going to discuss the former in this section. A magnetic field, commonly denoted by B, is a vector field in R3 that describes the direction and strength of magnetic force exerted at each point. Both electric current and presence of magnetic material can generate a magnetic field.

Figure 5.1: electric fields generated by point charges

Figure 5.2: a magnetic field generated by a bar magnet 134 Topics in Physics and Engineering

5.1.1 of Point Charges A point charge located at the origin generates an E-field according to the Coulomb’s Inverse- Square Law: q = E 2 eρ, 4πε0ρ

where q is the charge of the point particle, and ε0 is a positive constant depending on the background material (i.e. permittivity). Clearly, this E-field is radially symmetric, and whether it points inward or outward depends on the sign of q. If there is another point particle with charge Q, then the electric force exerted by the q-particle on the Q-particle is given by:

qQ = = F QE 2 eρ 4πε0ρ

where ρ and eρ are evaluated at the point of the Q-particle. Therefore, if both q and Q are of the same sign (i.e. like charges), the force F exerted on the Q-particle is pointing outward from the origin. This indicates the two charges repulse each other. On the other hand, if q and Q are of different sign, the force F exerted on the Q-particle is pointing inward, indicating an attraction. Note that the above E-field is not defined at the origin. Physically speaking, it can be interpreted as having highly concentrated charge density at the origin. We leave it as an exercise for readers to verify that:

∇ · E = 0 ∇ × E = 0

everywhere except the origin. Since R3 with the origin removed is a simply-connected region, we can still apply the curl test to claim that E is conservative. In spherical coordinates, the gradient operator is given by:

∂ f 1 ∂ f 1 ∂ f ∇ f = e + e + e . ∂ρ ρ ρ ∂ϕ ϕ ρ sin ϕ ∂θ θ

In physics, we usually define potential energy according to the negative convention: E = −∇V. Set: q ∂V 1 ∂V 1 ∂V = −∇ = − − − 2 eρ V eρ eϕ eθ. 4πε0ρ ∂ρ ρ ∂ϕ ρ sin ϕ ∂θ By equating the components, we have

q ∂V = − 2 4πε0ρ ∂ρ ∂V 0 = − ∂ϕ ∂V 0 = − ∂θ Clearly, V depends only on ρ, and by the first equation, we have:

q q = − = + V 2 dρ constant. ˆ 4πε0ρ 4πε0ρ

By picking the constant to be zero (so that the potential energy is 0 at infinity), the electric potential energy due to the q-particle is therefore given by: q V = . 4πε0ρ 5.1 Coulomb’s Law 135

The above discussion assumes the point charge is located at the origin, so that the E-field generated can be conveniently expressed using spherical coordinates. Using rectangular coordinates, the above E-field and the electric potential V are given by:

q (xi + yj + zk) E = 2 2 2 3/2 4πε0 (x + y + z ) q V = p 2 2 2 4πε0 x + y + z

If the point charge is located at (x0, y0, z0), then the E-field generated and its electric potential V are simply translations of the above, i.e.

q ((x − x )i + (y − y )j + (z − z )k) q r − r E = 0 0 0 = · 0 2 2 2 3/2 πε 3 4πε0 ((x − x0) + (y − y0) + (z − z0) ) 4 0 |r − r0| q q V = = p 2 2 2 4πε0 (x − x0) + (y − y0) + (z − z0) 4πε0 |r − r0|

where r = xi + yj + zk and r0 = x0i + y0j + z0k. If there are more than one point charges, the resulting E-field generated by these point charges is the vector sum of all E-fields generated by these charges. Precisely, if these particles are located at r1, r2, ... with charges q1, q2, ... correspondingly, then the overall E-field is given by: q r − r q r − r q r − r = 1 · 1 + 2 · 2 + = i · i E 3 3 ... ∑ 3 . 4πε0 |r − r1| 4πε0 |r − r2| i 4πε0 |r − ri| Physicists often call this rule the Principle of Superposition.

5.1.2 Electric Flux For a uniform E-field across a flat surface with unit normal ˆn and surface area A, the electric flux through this surface is defined to be:

(E · ˆn) A

For a non-uniform E-field across a curved surface S, the electric flux through the surface is defined using a surface integral:

ΦE = E · ˆn dS. ¨S Recall that the E-field generated by a point particle with charge q located at the origin is given by: q = E 2 eρ. 4πε0ρ Given a closed surface S, we have computed in the previous chapter that: ( 1 4π if S encloses the origin; · = 2 eρ ˆn dS ‹S ρ 0 otherwise.

Therefore, the electric flux through S due to a particle located at the origin is given by:

( q q q eρ 4π = if S encloses the origin; = · = 4πε0 ε0 ΦE 2 ˆn dS ‹S 4πε0 ρ 0 otherwise.

By translation invariance, the electric flux through a closed surface S due to a particle located at any position is given by:

( q ε if S encloses the particle; ΦE = 0 0 otherwise. 136 Topics in Physics and Engineering

For more than one point charges (say with charges q1, q2, ...), the overall E-field is given by

E = E1 + E2 + ... where Ei is the electric field generated by the qi-particle. Therefore, the resulting electric flux through a closed surface S is given by:

Φ = (E + E + ...) · ˆn dS = E · ˆn dS E ‹ 1 2 ∑ ‹ i S i S

For each i, the flux integral Ei · ˆn dS is zero if S does not enclose the qi-particle, and is equal ‹S to qi if S encloses the q -particle. Therefore, by summing up each individual electric flux, we ε0 i conclude that: Qenc ΦE = ε0 where Qenc is the sum of charges enclosed by S. It is commonly called the Gauss’s Law for Electricity (in the case of discrete charges). From practical viewpoint, a continuous charge distribution (i.e. charge cloud) is composed of densely packed discrete particles. Therefore, physicists assert that the Gauss’s Law for Electricity is true for any (discrete or continuous) charge distribution:

Q E · ˆn dS = enc ‹S ε0 where Qenc is the amount of charges enclosed by the closed surface S. 5.2 Introduction to Maxwell’s Equations 137 5.2 Introduction to Maxwell’s Equations 5.2.1 Integral Form Maxwell’s Equations are a set of four partial differential equations (PDE) which govern the interaction of E-fields, B-fields and currents. Each of the four has both an integral form and a differential form. The two forms are equivalent to each other by the Stokes’ and Divergence Theorems discussed in the previous chapter. Given any closed surface S enclosing a solid region D, and simply-connected surface Σ with a simple closed boundary curve C, the Maxwell’s Equations in integral form are:

Q E · ˆn dS = (Gauss’s Law for Electricity) ‹S ε0 B · ˆn dS = 0 (Gauss’s Law for Magnetism) ‹S ∂ E · dr = − B · ˆn dS (Faraday’s Law) ˛C ∂t ¨Σ  ∂  B · dr = µ0 J · ˆn dS + ε0 E · ˆn dS (Ampère’s Law) ˛C ¨Σ ∂t ¨Σ

Here E is the electric field, B is the magnetic field and J is the current. Q is the total amount of charges enclosed by the surface S. Both µ0 and ε0 are positive constants. We have discussed the Gauss’s Law for Electricity in the previous section. It can be derived by assuming the Coulomb’s Inverse-Square Law. The Gauss’s Law for Magnetism asserts that the magnetic flux through any closed surface must be zero! In physical terms, it means the amount of magnetic field entering the surface is exactly equal to that leaving the surface.

Figure 5.3: Gauss’s Law for Magnetism

Under the assumption that the Gauss’s Law for Magnetism is true, magnetic monopoles (i.e. magnets with only one pole) are ruled out. It is because a magnet with only one pole will generate a B-field similar to the E-field generated by a point charge, which is radially outward or inward, giving non-zero flux (positive if radially outward, negative if radially inward) on a surface enclosing the magnetic monopole. It would lead to a contradiction to the Gauss’s Law for Magnetism. Easy experiments tell us that breaking an ordinary bar magnet cannot create a magnetic monopole. Instead, the magnet will be broken into two magnets each of which contains both 138 Topics in Physics and Engineering

Figure 5.4: breaking a bar magnet does not create magnetic monopoles

south and north poles. If in someday a magnet monopole is discovered (maybe using some exotic materials), the Gauss’s Law for Magnetism will need to be modified. The Faraday’s and Ampère’s Laws in integral form look a bit more complicated than the two Gauss’s Laws. We will discuss them after we derive the differential form of Maxwell’s Equations.

5.2.2 Differential Form Using the Stokes’ and Divergence Theorems, one can convert the integral form of Maxwell’s Equations to the differential form, which is more elegant to state and easier to work with: $ ∇ · E = (Gauss’s Law for Electricity) ε0 ∇ · B = 0 (Gauss’s Law for Magnetism) ∂B ∇ × E = − (Faraday’s Law) ∂t ∂E ∇ × B = µ J + µ ε (Ampère’s Law) 0 0 0 ∂t To begin with, let’s describe a common technique that is often applied when converting an integral expression to a differential one: Theorem 5.1 Suppose f (x, y, z) and g(x, y, z) are two continuous functions such that

f (x, y, z) dV = g(x, y, z) dV ˚D ˚D

for any solid region D in R3, then the functions f and g are identical, i.e. f (x, y, z) = g(x, y, z) for any (x, y, z) in R3.

Proof. Here we present a heuristic, rather than a rigorous, proof. Since it is assumed that

f (x, y, z) dV = g(x, y, z) dV ˚D ˚D for any solid region D, it is in particular true for tiny ones. Take D to be a very tiny region in R3, then both f and g are approximately constants over the region D. Therefore,

f (x, y, z) dV ' g(x, y, z) dV. ˚D ˚D

By canceling out D dV on both sides, we get f (x, y, z) = g(x, y, z) over the region D. Since this tiny region D˝is arbitrarily chosen, one can repeat the same argument with other tiny regions to conclude that f (x, y, z) = g(x, y, z) over the whole R3.  5.2 Introduction to Maxwell’s Equations 139

One can use this theorem and the Divergence Theorem to derive the differential form of Gauss’s Laws. The Gauss’s Law for Magnetism asserts that:

B · ˆn dS = 0 ‹S for any closed surface S. Denote D to be the solid enclosed by S, then the Divergence Theorem asserts that: ∇ · B dV = B · ˆn dS = 0 = 0 dV. ˚D ‹S ˚D Note that S, and hence D, are arbitrary. By Theorem 5.1, we have:

∇ · B = 0.

This equation is called the differential form of the Gauss’s Law for Magnetism. One can also convert the Gauss’s Law for Electricity into a differential form by introducing the charge density function $, which infinitesimally measures the amount of charges per unit volume. The amount of charge Q in a solid region D is related to the charge density by:

Q = $ dV. ˚D Recall that the Gauss’s Law for Electricity asserts that: Q $ E · ˆn dS = = dV ‹S ε0 ˚D ε0 for any closed surface S. Here D is the solid region enclosed by S. Using the Divergence Theorem, we get: $ ∇ · E dV = dV. ˚D ˚D ε0 Since S and hence D are arbitrary, Theorem 5.1 shows: $ ∇ · E = . ε0 It is called the differential form of the Gauss’s Law for Electricity. In physics term, this equation tells us the denser the charges, the more diverging the E-field.

5.2.3 Faraday’s and Ampère’s Laws Similar technique can be applied to derive the differential form of Faraday’s and Ampère’s Laws. The integral form of the Faraday’s Law says that: ∂ E · dr = − B · ˆn dS ˛C ∂t ¨Σ for any simply-connected surface Σ with boundary C. Using Stokes’ Theorem, we have: ∂ ∂B (∇ × E) · ˆn dS = E · dr = − B · ˆn dS = − · ˆn dS. ¨Σ ˛C ∂t ¨Σ ¨Σ ∂t ∂B Since it is true for any surface Σ, one can pick a tiny surface Σ so that (∇ × E) · ˆn and − ∂t · ˆn are approximately constants, and so: ∂B (∇ × E) · ˆn dS = − · ˆn dS. ¨Σ ∂t ¨Σ

By canceling out Σ dS on both sides, we have ˜ ∂B (∇ × E) · ˆn = − · ˆn. ∂t 140 Topics in Physics and Engineering

Since the surface Σ is arbitrary, the unit normal ˆn is also arbitrary. In order for the above to be true, we get the differential form of the Faraday’s Law: Page 6 ∂B ∇ × E = − . ∂t

∂B The Faraday’s Law in differential form tells us the a changing B-field (so that ∂t is a non-zero vector) will induce a E-field. Moreover, the more rapid is the change of B-field, the more circular is the induced E-field.

Figure 5.5: a moving magnet induces a E-field on the wire

Similarly, the Ampère’s Law in integral form can be converted into differential form by Stokes’ Theorem:  ∂  B · dr = µ0 J · ˆn dS + ε0 E · ˆn dS ˛C ¨Σ ∂t ¨Σ  ∂E  (∇ × B) · ˆn dS = µ0J + µ0ε0 · ˆn dS ¨Σ ¨Σ ∂t Since C and Σ are arbitrary, we must have

∂E ∇ × B = µ J + µ ε . 0 0 0 ∂t The physical interpretation of the Ampère’s Law is a bit more complicated than that of Faraday’s Law. For the sake of simplicity, let’s consider only the case where the E-field is unchanging: ∇ × B = µ0J (when E is unchanging) Recall that J is the current. This equation tells us that when the E-field is unchanging, the presence of an electric current in a wire will induce a magnetic field circling around the wire. In fact, the equation ∇ × B = µ0J is the original form of Ampère’s Law. Later Maxwell ∂E found that it is incomplete and correct it by adding a ∂t term.

Figure 5.6: Ampère’s Law 5.3 Heat Diffusion 141 5.3 Heat Diffusion 5.3.1 Derivation of Heat Equation Let u(x, y, z, t) be the temperature at point (x, y, z) at time t. The heat equation:

∂u  ∂2u ∂2u ∂2u  = k + + ∂t ∂x2 ∂y2 ∂z2 governs the diffusion of heat. In this section, we will use several fundamental laws in physics and the Divergence Theorem to derive the heat equation. Heat diffusion is caused by displacement of heat energy. Fourier’s Law in physics asserts that heat energy transfers according to the following rule:

J = −a∇u

where J is a vector field (in energy per second) representing the flow of heat energy, and a is a positive constant depending on the medium. In other words, heat energy diffuses from higher temperature regions to lower ones, and the rate of diffusion is proportional to the magnitude of ∇u. Let D be an arbitrary solid region with boundary surface S. Denote $ to be the energy density function (in energy per volume), which equals to bu for some positive constant b whose value depends on the medium. Then the triple integral:

$ dV ˚D is the total amount of heat energy contained in the region D. On the other hand, the outward flux

J · ˆn dS ‹S measures the amount of heat loss through the closed surface S. By the conservation of heat energy, heat energy must escape through the surface S. In mathematical terms, it is stated as: ∂ $ dV = − J · ˆn dS. ∂t ˚D ‹S The negative sign appears on the RHS because of the outward convention of ˆn. Applying the above physical laws, we get: ∂u b dV = − (−a∇u) · ˆn dS ˚D ∂t ‹S ∂u b dV = a∇u · ˆn dS ˚D ∂t ‹S ∂u b dV = ∇ · (a∇u) dV (Divergence Theorem) ˚D ∂t ˚D Since D is arbitrary, we must have: ∂u b = ∇ · (a∇u) = a∇ · ∇u. ∂t We leave it as an exercise for readers to verify that:

∂2u ∂2u ∂2u ∇ · ∇u = + + . ∂x2 ∂y2 ∂z2 Therefore, we can conclude that: ∂u b  ∂2u ∂2u ∂2u  = + + ∂t a ∂x2 ∂y2 ∂z2 142 Topics in Physics and Engineering

b which is exactly the heat equation by defining k = a . Very often, ∇ · ∇u is denoted by ∇2u or ∆u. As such, the heat equation can be written as: ∂u = k∆u. ∂t

5.3.2 Fundamental Solution It can be verified that the following function satisfies the heat equation:

1  x2 + y2 + z2  Φ(x, y, z, t) = exp − . (4πkt)3/2 4kt

At a point (x, y, z) = (0, 0, 0), we have Φ(0, 0, 0, t) = 1 , and so: (4πkt)3/2 1 lim Φ(0, 0, 0, t) = lim = ∞. t→0 t→0 (4πkt)3/2

2 2 2  x +y +z  3/2 In contrast, if (x, y, z) 6= (0, 0, 0), both exp − 4kt and (4πkt) go to 0 as t → 0. However, the exponential term goes to 0 faster than the t3/2 term, so

1  x2 + y2 + z2  lim Φ(x, y, z, t) = lim exp − = 0 when (x, y, z) 6= (0, 0, 0). t→0 t→0 (4πkt)3/2 4kt

Therefore, the function Φ(x, y, z, t) represents the heat diffusion starting from a highly concen- trated heat source at t = 0. As time goes, the temperature distribution becomes more and more uniform. In general, if the initial temperature distribution is given by the function g(x, y, z), it can be shown (proof beyond the scope of the course) that the following function

∞ ∞ ∞ u(x, y, z, t) = Φ(x − u, y − v, z − w, t)g(u, v, w) dudvdw ˆ−∞ ˆ−∞ ˆ−∞

∂u satisfies the heat equation ∂t = k∆u with initial condition g(x, y, z), meaning that lim u(x, y, z, t) = g(x, y, z). t→0 In other words, the function u(x, y, z, t) predicts how heat diffuses when given an initial temperature profile g(x, y, z). However, the integral defining u(x, y, z, t) is in general difficult to compute explicitly.

5.3.3 Steady State A temperature distribution u(x, y, z, t) is said to be at steady state if it is independent of the ∂u time t, i.e. ∂t = 0. For such a temperature distribution, the heat equation implies that ∆u = 0.

The above equation is often called the Laplace Equation. Now given a closed surface S which encloses a solid region D. Using the Divergence Theorem, one can show that at a steady state if the temperature on the surface S is constant, then the temperature inside the surface S is also a constant. To argue this, we denote u(x, y, z) to be a steady state temperature distribution (i.e. ∆u = 0), and that:

u(x, y, z) = C for any (x, y, z) on S.

Next we consider the vector field (u − C)∇u. We leave it as an exercise for readers to verify from the definition that:

∇ · ((u − C)∇u) = |∇u|2 + (u − C)∆u. 5.3 Heat Diffusion 143

At steady state, we have ∆u = 0 and so ∇ · ((u − C)∇u) = |∇u|2. Next we integrate this result over D: ∇ · ((u − C)∇u) dV = |∇u|2 dV. ˚D ˚D Applying the Divergence Theorem on the LHS, we get:

(u − C)∇u · ˆn dS = |∇u|2 dV. ‹S ˚D From our assumption, we have u = C for any point on S. Therefore, the integrand (u − C)∇u · ˆn of the flux integral on LHS is zero, and so:

0 = |∇u|2 dV. ˚D

Since the integrand |∇u|2 is non-negative, the only scenario for the above to happen is that

∇u(x, y, z) = 0 for any (x, y, z) in D.

Therefore, u must be a constant in the region D, and by continuity, this constant must be C. 144 Topics in Physics and Engineering 5.4 Dirac Delta Functions Recall that the following function

1  x2 + y2 + z2  Φ(x, y, z, t) = exp − (4πkt)3/2 4kt

is the solution to the heat equation starting at t = 0 with a highly concentrated heat source at the origin. As t → 0, we have shown that: ( 0 if (x, y, z) 6= (0, 0, 0); lim Φ(x, y, z, t) = t→0 ∞ otherwise.

The in R3, denoted by δ, is a “function” which can be loosely defined as follows: ( 0 if (x, y, z) 6= (0, 0, 0); δ(x, y, z) = ∞ otherwise. As such, we then have lim Φ(x, y, z, t) = δ(x, y, z) t→0 and so the delta function δ can be thought as the initial condition for Φ.

Figure 5.7: the Dirac delta function can be loosely regarded as the above graph

While the above definition of δ is very neat, it is paradoxical. From mathematical viewpoint, it is a function which is zero almost everywhere in R3. By the definition of integral, the above definition of δ will imply:

∞ ∞ ∞ δ(x, y, z) dxdydz = 0. ˆ−∞ ˆ−∞ ˆ−∞

On the other hand in physics, the integral

∞ ∞ ∞ δ(x, y, z) dxdydz ˆ−∞ ˆ−∞ ˆ−∞

represents the total heat energy. It being 0 means there is no heat energy over the entire R3. As time goes by, the heat distribution becomes Φ since it solves the heat equation and Φ → δ as t → 0. However, it can be computed (using spherical coordinates) that:

∞ ∞ ∞ Φ(x, y, z, t) dxdydz = 1 ˆ−∞ ˆ−∞ ˆ−∞

for any t > 0, meaning the total heat energy in R3 suddenly jumps to 1 as t becomes positive. It contradicts the conservation of energy! 5.4 Dirac Delta Functions 145

To resolve this paradox, mathematicians introduced the concept of generalized functions so that the Dirac delta “function” is regarded as a distribution rather than an ordinary function. The rigorous theory of generalized functions is very advanced and well beyond the scope of this course. To put it in simpler term, the Dirac delta “function” δ is a generalized function that satisfies: ( 1 if D contains the origin δ dV = ˚D 0 otherwise. There is no ordinary function that satisfies the above. Therefore, the Dirac delta “function” is not regarded by mathematicians as a function depsite its name. When we discuss the Gauss’s Law for Gravity, we figured out that: ( 1 4π if S encloses the origin; · = 2 eρ ˆn dS ‹S ρ 0 otherwise.

“Applying” the Divergence Theorem to this vector field, we get:

1  1  · = ∇ · 2 eρ ˆn dS 2 eρ dV. ‹S ρ ˚D ρ

1 e We put “applying” in quote because the vector field ρ2 ρ is not defined at the origin, so rigorously speaking one cannot applying the Divergence Theorem here. By decreeing the Divergence Theorem holds for this vector field, we obtain: (  1  4π if D contains the origin; ∇ · = 2 eρ dV ˚D ρ 0 otherwise.

Therefore, 1  1  ∇ · e 4π ρ2 ρ satisfies the definition of the Dirac delta function, and so one can write:

 1  ∇ · e = 4πδ. ρ2 ρ

According to the Coulomb’s Law, a point particle with charge q at the origin generates a E-field given by: q = E 2 eρ. 4πε0ρ From the above discussion, we get: q q ∇ · E = · 4πδ = δ. 4πε0 ε0

The Gauss’s Law for Electricity in differential form states that ∇ · E = $ where $ is the ε0 charge density. Therefore, using the Dirac delta function (again decree that the Divergence Theorem is true for the above E-field), one can derive:

$ = qδ.

In practice, one can regard qδ as the charge density distribution for a point particle (with charge q).