
A Brief Primer: Differentiation, Integration, and Differential Equations

Tyler D. Robinson

Fall 2019

Copyright © 2019 Tyler D. Robinson. All rights reserved.


Contents

1 Differentiation
1.1 Commonly Used Derivatives
1.2 Properties of Derivatives
1.3 Derivatives and Function Extrema
1.4 Taylor Series
1.5 Partial and Total Derivatives
1.6 Generalization to Higher Dimensions

2 Integration
2.1 Commonly Used Integrals
2.2 Properties of Integrals
2.3 Techniques for Solving Integrals
2.4 Multi-Dimensional Integrals

3 Differential Equations
3.1 Ordinary Differential Equations
3.2 First-Order Linear Ordinary Differential Equations
3.3 Partial Differential Equations

4 A Practicality

Why calculus? Well, for a lot of reasons, to be honest. Differentiation provides an approach to determine the extrema of functions. Derivatives, or rates of change as determined via differentiation, can describe how physical quantities (e.g., energy) build up or are depleted in systems. Integration can be used to determine the net effect of a physical process (via, e.g., taking an integral over time), and is often used in statistical analysis to “average out” the impact of some nuisance parameter (in a process known as marginalization). Finally, differential equations are fundamental to how we describe systems. While many people claim that differential equations are the language of the Universe, it is likely more proper to say that differential equations are the language we use to describe the Universe.

1 Differentiation

The formal definition of the derivative of a single-parameter function relies on investigating the behavior of this function at two nearby points as the distance between these two points is allowed to shrink to zero. Let $f(x)$ be our one-dimensional function; the (first) derivative of this function (at location $x$) is then defined by,

\[ \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} , \tag{1} \]

where it is common to see the simplified notation of,

\[ f'(x) = \frac{df}{dx} . \tag{2} \]

What this formalism is actually doing is best described graphically. As shown in Figure 1, the definition above describes the steepness (slope) of a line drawn from $[x, f(x)]$ to $[x + \Delta x, f(x + \Delta x)]$. So, as $\Delta x$ shrinks to zero, the derivative measures the local slope of $f(x)$ at location $x$.
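The limit definition can also be explored numerically. The short Python sketch below is an illustration only, not part of the original notes: the helper name `forward_difference` and the choice of $\sin x$ as a test function are assumptions. It evaluates the slope of the line between the two nearby points for progressively smaller $\Delta x$ and compares it against the known derivative, $\cos x$.

```python
import math

def forward_difference(f, x, dx):
    """Slope of the line through [x, f(x)] and [x + dx, f(x + dx)]."""
    return (f(x + dx) - f(x)) / dx

x = 1.0
exact = math.cos(x)  # known derivative of sin x, used only for comparison

# As dx shrinks, the slope of the line approaches the derivative.
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    estimate = forward_difference(math.sin, x, dx)
    print(f"dx = {dx:.0e}  slope = {estimate:.6f}  error = {abs(estimate - exact):.2e}")
```

The error should shrink roughly in proportion to $\Delta x$, which is the expected behavior for this one-sided estimate.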

Exercise Use the limit definition of differentiation to evaluate the derivative of the exponential function, $e^x$, at $x$.

If the derivative of a function is another well-behaved function, then you can apply the derivative operator a second time, thereby determining the second derivative. Here, one writes,

\[ \frac{d^2 f}{dx^2} = \frac{d}{dx}\left(\frac{df}{dx}\right) = f''(x) . \tag{3} \]

In general, the $n$th derivative of a function is written as $d^n f/dx^n$, although in physics it is rare to explore beyond a second derivative.

1.1 Commonly Used Derivatives

In practice, no enterprising physicist uses the limit definition of differentiation to take the derivative of a function. Instead, we all simply commit a handful of derivatives of key functions to memory (or look them up when we forget!). For your convenience, a decent list of useful derivatives is provided here:

\[ \frac{d}{dx}\, c x^n = n c x^{n-1} , \tag{4} \]

\[ \frac{d}{dx} \sin x = \cos x , \tag{5} \]

\[ \frac{d}{dx} \cos x = -\sin x , \tag{6} \]

\[ \frac{d}{dx} \tan x = \sec^2 x , \tag{7} \]

\[ \frac{d}{dx} \sec x = \sec x \tan x , \tag{8} \]

\[ \frac{d}{dx} \csc x = -\csc x \cot x , \tag{9} \]

\[ \frac{d}{dx} \cot x = -\csc^2 x , \tag{10} \]

\[ \frac{d}{dx}\, e^x = e^x , \tag{11} \]

\[ \frac{d}{dx}\, a^x = a^x \ln a , \tag{12} \]

\[ \frac{d}{dx} \ln(x) = \frac{1}{x} , \tag{13} \]

\[ \frac{d}{dx} \log_a(x) = \frac{1}{x \ln a} . \tag{14} \]

1.2 Properties of Derivatives

The basic derivatives given above, when paired with a few key rules of differentiation, will (nearly always) enable you to take the derivative of a complicated function. The first important rule of differentiation is that it is distributive, with,

\[ \frac{d}{dx}\left[ f(x) + g(x) \right] = \frac{df}{dx} + \frac{dg}{dx} , \tag{15} \]

where $g(x)$ is simply another function. Second, derivatives obey the product rule, with,

\[ \frac{d}{dx}\left[ f(x) \cdot g(x) \right] = f(x) \cdot \frac{dg}{dx} + g(x) \cdot \frac{df}{dx} . \tag{16} \]

For the product rule, it is often useful to remember, “first times the derivative of the second plus second times the derivative of the first.” The chain rule enables us to take derivatives of functions of functions, and is given by,

\[ \frac{d}{dx} f\big(g(x)\big) = \left. \frac{df}{dx} \right|_{g(x)} \cdot \frac{dg}{dx} = f'\big(g(x)\big) \cdot g'(x) . \tag{17} \]

Finally, the quotient rule, which actually just follows from the product rule, states,

\[ \frac{d}{dx}\left[ \frac{f(x)}{g(x)} \right] = \frac{g(x) \frac{df}{dx} - f(x) \frac{dg}{dx}}{g^2(x)} = \frac{g(x) f'(x) - f(x) g'(x)}{g^2(x)} . \tag{18} \]

Regarding the quotient rule, it is often helpful to remember the phrase, “bottom times the derivative of the top minus top times the derivative of the bottom all over bottom squared.”
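One way to build confidence in the product, chain, and quotient rules is to compare them against a numerical derivative. The Python sketch below is illustrative only: the test functions $f(x) = \sin x$ and $g(x) = x^2 + 1$, the evaluation point, and the helper name `derivative` are all assumptions chosen for the example. It uses a central-difference estimate, which is a standard numerical technique rather than anything prescribed by these notes.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference estimate of d(func)/dx at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Assumed test functions and their known derivatives.
f, df = math.sin, math.cos          # f(x) = sin x, f'(x) = cos x
g = lambda x: x**2 + 1.0            # g(x) = x^2 + 1
dg = lambda x: 2.0 * x              # g'(x) = 2x

x = 0.7

# Right-hand sides of the product, chain, and quotient rules.
product_rule = f(x) * dg(x) + g(x) * df(x)
chain_rule = df(g(x)) * dg(x)
quotient_rule = (g(x) * df(x) - f(x) * dg(x)) / g(x)**2

# Direct numerical derivatives of the combined functions.
product_numeric = derivative(lambda t: f(t) * g(t), x)
chain_numeric = derivative(lambda t: f(g(t)), x)
quotient_numeric = derivative(lambda t: f(t) / g(t), x)

print(product_rule, product_numeric)
print(chain_rule, chain_numeric)
print(quotient_rule, quotient_numeric)
```

Each printed pair should agree to many decimal places; a mismatch would point to a mistake in applying the rule.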

Exercise Investigate the derivative of ln x + ln a using two approaches. First, apply the distributive property of differentiation. Second, use the additive properties of logarithms and then apply the derivative.

Exercise Investigate the derivative of $e^x \cdot e^a$ using two approaches. First, apply the product rule. Second, use the multiplicative properties of exponentials and then apply the derivative.

Exercise Investigate the derivative of $\ln e^x$ using two approaches. First, apply the chain rule for differentiation. Second, simplify the function and then apply the derivative.

Exercise Investigate the derivative of sin x/ cos x using the quotient rule.

1.3 Derivatives and Function Extrema

When a function reaches a local minimum or maximum, it ceases to change at the point of the extremum. Or, put another way, the function has no slope. Thus, the zero points of the derivative of a function can be used to find where this function has its maxima and minima. Here, one uses the so-called first derivative test and simply finds the roots of,

\[ \frac{df}{dx} = 0 . \tag{19} \]

How do we know if a root of the expression above is a local minimum or local maximum? We use the second derivative test, which simply states that, for $f'(x) = 0$, the function has a local maximum if $f''(x) < 0$ or a local minimum if $f''(x) > 0$. How do we know this? Let’s first investigate the definition of the second derivative, with,

\[ \frac{d}{dx} f'(x) = \lim_{\Delta x \to 0} \frac{f'(x + \Delta x) - f'(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f'(x + \Delta x)}{\Delta x} , \tag{20} \]

where we have used the fact that $f'(x) = 0$ in the last step. If $f''(x) > 0$, then, for small $\Delta x$, we have,

\[ \frac{f'(x + \Delta x)}{\Delta x} > 0 , \tag{21} \]

which means that, for $\Delta x < 0$ we have $f'(x + \Delta x) < 0$ and for $\Delta x > 0$ we have $f'(x + \Delta x) > 0$. So, for a positive second derivative, $f$ is decreasing when approached from the left and increasing when approached from the right: a local minimum. A diligent student can repeat this proof for a local maximum when $f''(x) < 0$.
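The first and second derivative tests can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the cubic $f(x) = x^3 - 3x$ is a hypothetical test function (not from these notes), chosen because its derivative, $3x^2 - 3$, has the roots $x = \pm 1$ by inspection.

```python
def f_prime(x):
    """First derivative of the assumed test function f(x) = x^3 - 3x."""
    return 3.0 * x**2 - 3.0

def f_double_prime(x):
    """Second derivative of f(x) = x^3 - 3x."""
    return 6.0 * x

def classify(x_crit):
    """Apply the second derivative test at a root of f'(x)."""
    curvature = f_double_prime(x_crit)
    if curvature > 0:
        return "local minimum"
    if curvature < 0:
        return "local maximum"
    return "inconclusive"  # the test says nothing when f''(x) = 0

# Roots of f'(x) = 0, found by hand for this simple polynomial.
critical_points = [-1.0, 1.0]
for x_crit in critical_points:
    print(x_crit, f_prime(x_crit), classify(x_crit))
```

Note the third branch: when the second derivative also vanishes, the test is inconclusive and the function must be examined another way.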

Exercise Where are the local extrema of sin x? Which are minima and which are maxima?

Exercise Take a chemical species in an atmosphere to have an opacity $\kappa$ and a mass mixing ratio $\mu$. Assume the atmosphere is isothermal with constant pressure scale height $H$. Here, then, the vertical optical depth (measured from the top of the atmosphere) is given by $\tau(z) = \mu \kappa H \rho_0 e^{-z/H} = \tau_0 e^{-z/H}$, where $z$ is altitude and $\rho_0$ is the atmospheric mass density at the surface. Given a source of flux at the top of the atmosphere (e.g., solar flux), the shortwave flux density in this atmosphere is (roughly) given by $F(z) = F_0 e^{-\tau(z)}$, where $F_0$ is the flux density at the top of the atmosphere. The heating rate (in terms of photon energy deposited per unit volume per unit time) is then given by $Q = dF/dz$. Determine the optical depth where the heating rate reaches a maximum. Can you give a physical explanation for your finding?

1.4 Taylor Series

In analysis it is often very useful to linearize the behavior of a function. A Taylor series enables this linearization, and actually extends to whatever higher-order term you desire. For a function that is infinitely differentiable, $f(x)$, the polynomial expansion of this function around some point $x_0$ is given by,

\[ f(x_0) + \left. \frac{df}{dx} \right|_{x_0} \frac{x - x_0}{1!} + \left. \frac{d^2 f}{dx^2} \right|_{x_0} \frac{(x - x_0)^2}{2!} + \ldots , \tag{22} \]

or

\[ \sum_{n=0}^{\infty} \left. \frac{d^n f}{dx^n} \right|_{x_0} \frac{(x - x_0)^n}{n!} . \tag{23} \]
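To see how quickly a truncated Taylor series can converge, the Python sketch below sums the first few terms of eq. (23) for $\cos x$. This is an illustration only: the function name `taylor_cos` is hypothetical, and $\cos x$ is used as an assumed test case because its $n$th derivative has the closed form $\cos(x + n\pi/2)$.

```python
import math

def taylor_cos(x, x0, order):
    """Partial sum of the Taylor series of cos x around x0, up to `order`."""
    total = 0.0
    for n in range(order + 1):
        # The n-th derivative of cos x is cos(x + n*pi/2).
        nth_derivative = math.cos(x0 + n * math.pi / 2.0)
        total += nth_derivative * (x - x0)**n / math.factorial(n)
    return total

x, x0 = 0.3, 0.0
for order in (0, 2, 4, 8):
    approx = taylor_cos(x, x0, order)
    print(f"order {order}: {approx:.12f}  error = {abs(approx - math.cos(x)):.2e}")
```

Close to the expansion point, only a few terms are needed, which is why low-order expansions are so widely used to linearize a problem.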

Exercise What are the first-order and second-order Taylor series expansions of $e^x$ around $x = 0$?

1.5 Partial and Total Derivatives

For multi-variate functions we sometimes need to know their derivative with respect to one variable while all other variables are held fixed, and we sometimes need to know their derivative with respect to one variable while including the response of all other variables to these changes. The former case is called a partial derivative and the latter is called a total derivative.

As an example, imagine you are purchasing a car from a foreign country. The total cost of this car depends on the price you negotiate for the automobile as well as the cost to ship the automobile. Naively, you might expect that if the negotiated price increases by $\Delta c$, then the total cost of the car also increases by $\Delta c$; this is the partial derivative. However, in reality you might find that the cost to ship the car also depends on the negotiated price of the car (e.g., maybe through additional insurance costs). The total derivative with respect to negotiated car cost would incorporate both how the negotiated cost directly impacts your total cost as well as how it indirectly impacts your total cost through shipping expenses.

Partial and total derivatives are related to one another in a somewhat simple way. For a two-dimensional function $f(x, y)$, the partial derivative with respect to $x$ is written as $\partial f / \partial x$. If $y$ has some dependence on $x$, then the total derivative with respect to $x$ is given by,

\[ \frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dx} . \tag{24} \]
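The relation in eq. (24) can be checked numerically. In the Python sketch below, the function $f(x, y) = x^2 y$ and the dependence $y(x) = \sin x$ are assumptions made purely for illustration; the analytic total derivative is compared with a central-difference estimate in which $y$ is allowed to respond to changes in $x$.

```python
import math

def f(x, y):
    """Assumed two-variable test function, f(x, y) = x^2 * y."""
    return x**2 * y

def y_of_x(x):
    """Assumed dependence of y on x."""
    return math.sin(x)

def total_derivative(x):
    """Right-hand side of eq. (24) for the assumed f and y(x)."""
    y = y_of_x(x)
    df_dx_partial = 2.0 * x * y   # partial f / partial x, with y held fixed
    df_dy_partial = x**2          # partial f / partial y, with x held fixed
    dy_dx = math.cos(x)
    return df_dx_partial + df_dy_partial * dy_dx

def numeric_total_derivative(x, h=1e-6):
    """Central difference in which y responds to the change in x."""
    return (f(x + h, y_of_x(x + h)) - f(x - h, y_of_x(x - h))) / (2.0 * h)

x = 1.2
print("partial only:", 2.0 * x * y_of_x(x))
print("total (eq. 24):", total_derivative(x))
print("total (numeric):", numeric_total_derivative(x))
```

The partial derivative alone misses the indirect contribution through $y(x)$, much like ignoring the shipping cost in the car example.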

1.6 Generalization to Higher Dimensions

Multi-variate functions have an operator akin to the derivative called the gradient. The gradient of a multi-dimensional function is a vector in these higher dimensions, and this vector points in the direction of the greatest rate of increase of the function. The magnitude of this vector gives the rate of change of the function in this direction (i.e., the slope in the direction of the greatest rate of increase). In physics, we typically encounter the gradient in either two- or three-dimensional problems. In Cartesian coordinates, the definitions of these 2-D and 3-D gradients are, respectively,

\[ \nabla f(x, y) = \frac{\partial f}{\partial x} \hat{x} + \frac{\partial f}{\partial y} \hat{y} , \tag{25} \]

and

\[ \nabla f(x, y, z) = \frac{\partial f}{\partial x} \hat{x} + \frac{\partial f}{\partial y} \hat{y} + \frac{\partial f}{\partial z} \hat{z} , \tag{26} \]

where $\hat{x}$, $\hat{y}$, and $\hat{z}$ are unit vectors in the $x$-, $y$-, and $z$-directions, respectively.
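A 2-D gradient is straightforward to estimate numerically by differencing along each axis in turn. The Python sketch below is a minimal illustration, assuming the hypothetical test function $f(x, y) = x^2 + 3y^2$ (not from these notes), whose exact gradient is $2x\,\hat{x} + 6y\,\hat{y}$.

```python
import math

def f(x, y):
    """Assumed test function with exact gradient (2x, 6y)."""
    return x**2 + 3.0 * y**2

def gradient(func, x, y, h=1e-5):
    """Central-difference estimate of the 2-D gradient of func at (x, y)."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)  # y held fixed
    df_dy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)  # x held fixed
    return df_dx, df_dy

gx, gy = gradient(f, 1.0, 2.0)
magnitude = math.hypot(gx, gy)                     # steepest rate of increase
unit_direction = (gx / magnitude, gy / magnitude)  # direction of steepest increase

print("gradient:", (gx, gy))
print("magnitude:", magnitude)
print("direction:", unit_direction)
```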

Exercise The gravitational potential for a point mass $M$ is given by $V(r) = -GM/r$, where $G$ is the universal gravitational constant. (Recall that gravitational potentials are negative to indicate that orbiting bodies are bound: energy must be added to escape the gravitational potential well.) Use the definition of the gradient in 3-D Cartesian coordinates to explore $\nabla V$. In what direction does the gradient point, and what is its magnitude?

2 Integration

Evaluating an integral is often described as “finding the area under a curve,” but what does this mean? Take a single-variate function, $f(x)$, and imagine a rectangle that extends from $x$ to $x + \delta x$ along the horizontal, and from the horizontal axis (i.e., $f = 0$) to $f(x)$ in the vertical direction. The area of this rectangle is given simply by the length of its base along the horizontal multiplied by its height, or $f(x)\,\delta x$, as depicted in Figure 2. The purpose of an integral is simply to sum these infinitesimal areas across a range of $x$-values (for a definite integral) or across an undefined range (for an indefinite integral).

Exercise Consider the function $f(x) = x$. Using the “area under the curve” definition of an integral, evaluate $\int_0^x f(x)\,dx$.
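The rectangle-summing picture translates directly into code. The Python sketch below is an illustration only: the function name `riemann_sum` and the test integrand $f(x) = x^2$ on $[0, 1]$ (with exact area $1/3$) are assumptions. It adds up the areas $f(x)\,\delta x$ for increasingly narrow rectangles.

```python
def riemann_sum(f, a, b, n):
    """Sum the areas f(x) * dx of n rectangles spanning [a, b]."""
    dx = (b - a) / n
    # Each rectangle's height is taken at its left edge.
    return sum(f(a + i * dx) * dx for i in range(n))

def f(x):
    """Assumed test integrand; the exact area on [0, 1] is 1/3."""
    return x**2

for n in (10, 100, 1000, 10000):
    area = riemann_sum(f, 0.0, 1.0, n)
    print(f"n = {n:5d}  area = {area:.6f}  error = {abs(area - 1.0 / 3.0):.2e}")
```

As the rectangles narrow, the summed area approaches the exact value, which is the sense in which an integral is a sum of infinitesimal areas.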

Integrals are also the inverse counterpart to derivatives. Specifically, integrals have the property,

\[ \int_a^b \frac{df}{dx}\,dx = f(b) - f(a) , \tag{27} \]

and

\[ \int \frac{df}{dx}\,dx = f(x) + C , \tag{28} \]

where $C$ is a constant. Why the constant? You will see this whenever an indefinite integral is presented. Imagine taking the derivative of the above. The left-hand side yields $f'(x)$, and the right-hand side yields $\frac{d}{dx}\left[ f(x) + C \right] = f'(x) + 0$. So, the two sides of the original equation are identical to within an additive constant that is independent of $x$.

2.1 Commonly Used Integrals

As with derivatives, in practice no one uses the area definition of integration when evaluating an integral. (Well, except when evaluating non-analytic integrals numerically.) Instead, we remember a handful of commonly used integrals, and use integral tables when we get stumped. Some key integrals to have handy are:

\[ \int x^n\,dx = \frac{1}{n + 1} x^{n+1} + C \quad (n \neq -1) , \tag{29} \]

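\[ \int \frac{1}{x}\,dx = \ln|x| + C , \tag{30} \]

\[ \int \sin x\,dx = -\cos x + C , \tag{31} \]

\[ \int \cos x\,dx = \sin x + C , \tag{32} \]

\[ \int \tan x\,dx = \ln|\sec x| + C , \tag{33} \]

\[ \int \cot x\,dx = \ln|\sin x| + C , \tag{34} \]

\[ \int \sec x\,dx = \ln|\sec x + \tan x| + C , \tag{35} \]

\[ \int e^x\,dx = e^x + C , \tag{36} \]

\[ \int a^x\,dx = \frac{a^x}{\ln a} + C , \tag{37} \]

\[ \int \ln x\,dx = x \left( \ln x - 1 \right) + C , \tag{38} \]

\[ \int x e^x\,dx = (x - 1) e^x + C . \tag{39} \]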

2.2 Properties of Integrals

A few rules of integration and properties of integrals are important to remember. First, constants can be factored out of an integral, with,

\[ \int a f(x)\,dx = a \int f(x)\,dx . \tag{40} \]

Second, integration is distributive, following,

\[ \int \left[ f(x) + g(x) \right] dx = \int f(x)\,dx + \int g(x)\,dx . \tag{41} \]

Also, a trio of useful properties of definite integrals are:

\[ \int_a^a f(x)\,dx = 0 , \tag{42} \]

\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx , \tag{43} \]

and

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx . \tag{44} \]

2.3 Techniques for Solving Integrals

Two important techniques exist for converting complex integrals into something (hopefully) a bit more manageable. The first of these techniques is called substitution. Here, you are investigating the integral of $f(x)$ and notice that the integral would be in a much more agreeable form if you made the substitution $u = u(x)$. The substitution rule then states,

\[ \int_a^b f(x)\,dx = \int_{u(a)}^{u(b)} f(u) \left( \frac{du}{dx} \right)^{-1} du . \tag{45} \]

(Note that this rule works identically for indefinite integrals.) Put another way, if you notice an integral happens to have the form $\int_a^b f\big(g(x)\big)\, g'(x)\,dx$, for some arbitrary $g(x)$, then the substitution $u = g(x)$ yields,

\[ \int_a^b f\big(g(x)\big)\, g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du . \tag{46} \]
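The substitution rule in eq. (46) can be sanity-checked by evaluating both sides numerically. The Python sketch below is illustrative only: the choices $f = \cos$, $g(x) = x^2$, and the limits $a = 0$, $b = 2$ are assumptions, and the midpoint rule used here is a standard numerical technique rather than anything from these notes.

```python
import math

def midpoint_integral(func, a, b, n=10000):
    """Midpoint-rule estimate of the integral of func from a to b."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx

# Assumed example: f(u) = cos u with u = g(x) = x^2, so g'(x) = 2x.
f = math.cos
g = lambda x: x**2
g_prime = lambda x: 2.0 * x
a, b = 0.0, 2.0

# Left-hand side of eq. (46): integrate f(g(x)) g'(x) over x.
left_side = midpoint_integral(lambda x: f(g(x)) * g_prime(x), a, b)

# Right-hand side of eq. (46): integrate f(u) over u, with transformed limits.
right_side = midpoint_integral(f, g(a), g(b))

print(left_side, right_side, math.sin(g(b)) - math.sin(g(a)))
```

Note that the limits of integration change along with the variable; forgetting to transform them is a common source of error.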

Exercise Use substitution to evaluate $\int e^{2x}\,dx$.

Exercise Evaluate $\int dx / \left( a^2 - x^2 \right)$. Hint: Consider the substitution $u = (x + a)/(x - a)$.

The second key technique is called integration by parts, which states,

\[ \int f\,dg = fg - \int g\,df , \tag{47} \]

or, for the definite case,

\[ \int_a^b f\,dg = fg\,\Big|_a^b - \int_a^b g\,df . \tag{48} \]

Application of integration by parts usually begins with noticing that your original integrand can be separated into two parts, one relatively simple and the other with a known integral. For the first part ($f$ in the equations above), you evaluate its derivative. For the second part ($dg$ in the equations above), you simply evaluate $g = \int dg$.
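As a brief worked illustration (an added example, not part of the original notes), integration by parts can be applied to $\int x e^x\,dx$. Choosing the simple part as $f = x$ and the part with a known integral as $dg = e^x\,dx$ gives $df = dx$ and $g = e^x$, so that eq. (47) yields,

\[ \int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + C = (x - 1) e^x + C , \]

which agrees with the table entry in eq. (39). Taking the derivative of the result with the product rule recovers $x e^x$, which is a quick way to confirm that the split into $f$ and $dg$ was applied correctly.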

Exercise Use integration by parts to evaluate $\int \tan x / \cos x\,dx$. Hint: Consider using $f = \sin x$ and take advantage of the useful derivatives provided above.

2.4 Multi-Dimensional Integrals

In certain physical applications we sometimes need to sum a quantity over a given area (e.g., the surface of a planet) or over a given volume (e.g., the interior of a planet). Here, the concept of “area under a curve” extends straightforwardly to quantities summed over area or volume. In Cartesian coordinates, integrating a function over area or volume, respectively, takes the form,

\[ \int_A f\,dA = \int dy \int f(x, y)\,dx , \tag{49} \]

and

\[ \int_V f\,dV = \int dz \int dy \int f(x, y, z)\,dx . \tag{50} \]

In polar $(r, \phi)$ and spherical $(r, \phi, \theta)$ coordinates these become,

\[ \int_A f\,dA = \int d\phi \int f(r, \phi)\, r\,dr , \tag{51} \]

and

\[ \int_V f\,dV = \int d\cos\theta \int d\phi \int f(r, \phi, \theta)\, r^2\,dr . \tag{52} \]

Here, for example, on a globe the angle $\phi$ would correspond to a measure of longitude (i.e., the azimuthal angle) while $\theta$ corresponds to a measure of latitude (i.e., the polar angle).
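The polar-coordinate form in eq. (51) can be sketched numerically as a double sum over small patches of area $r\,dr\,d\phi$. The Python example below is an illustration under stated assumptions: the function name `polar_integral`, the grid sizes, and the two test integrands (a constant, which gives the area of a disk, and $r^2 \cos^2\phi$) are all choices made for this sketch.

```python
import math

def polar_integral(func, radius, n_r=200, n_phi=200):
    """Midpoint-rule estimate of the integral of func(r, phi) over a disk."""
    dr = radius / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            # Each patch has area r * dr * dphi; note the extra factor of r.
            total += func(r, phi) * r * dr * dphi
    return total

# With f = 1 the integral is just the area of the disk, pi * R^2.
disk_area = polar_integral(lambda r, phi: 1.0, radius=2.0)

# With f = r^2 cos^2(phi) the exact result is pi * R^4 / 4.
second_moment = polar_integral(lambda r, phi: (r * math.cos(phi))**2, radius=2.0)

print(disk_area, math.pi * 2.0**2)
print(second_moment, math.pi * 2.0**4 / 4.0)
```

The factor of $r$ in the area element is the piece most often forgotten; dropping it would give $2\pi R$ rather than $\pi R^2$ for the first integral.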

Exercise Imagine a world whose atmospheric mass density follows $\rho(r) = \rho_0 \exp\left[ -(r - R_p)/H \right]$, where $\rho_0$ is the density at the surface, where $r = R_p$, and $H$ is a scale height. What is the total mass of the atmosphere?

3 Differential Equations

A differential equation, in short, relates some function to its derivatives. (An integro-differential equation relates some function to its derivatives and integrals.) Differential equations are ubiquitous across physics, astronomy, and planetary science. Do you want solutions to motions in a gravitational potential? Well, Newton’s laws of motion will relate how the momentum of a mass will change in time to its location within the potential; that is, the second derivative of position is related to position. Are you interested in how heat diffuses or convects through a body or atmosphere? These physics can be described with a differential equation.

Below, we emphasize two broad categories of differential equations: ordinary and partial. For both categories, note that differential equations have an order, which refers to the highest-ordered derivative that appears within the expression. So, a differential equation that contains a second derivative of an unknown function is called “second order.” Differential equations (either ordinary or partial) are linear as long as the unknown function and its derivatives are not raised to any power above unity (i.e., are linear in these quantities). Finally, in practice, solutions to differential equations include (at least) one unknown constant. Here, boundary conditions (or initial values) are used to determine this/these constant(s).

3.1 Ordinary Differential Equations

A differential equation is said to be ordinary when the expressions contain an unknown function of only a single variable. For example,

\[ \frac{dy}{dt} = t , \tag{53} \]

is an ordinary differential equation, where $y(t)$ is the unknown function to be solved for. Ordinary differential equations can be non-linear, but cannot be partial differential equations.

A straightforward category of ordinary differential equations to solve are so-called integration equations, which receive this label because they can be solved via straightforward integration. Integration equations take the form,

\[ \frac{dy}{dt} = f(t) . \tag{54} \]

Integration equations can be solved by, effectively, moving the $dt$ to the right-hand side. (True mathematicians hate discussion along these lines because derivatives cannot, formally, be separated into their numerator and denominator.) So, the solution to the general integration equation above is,

\[ y(t) + C = \int dy = \int f(t)\,dt . \tag{55} \]

Exercise Solve dy/dt = t, which is an integration equation.

A second category of ordinary differential equation with straightforward solutions are autonomous equations. These have the form,

\[ \frac{dy}{dt} = g(y) . \tag{56} \]

Note that not all “autonomous” differential equations actually have time as the independent variable. Again, via re-arrangement, we have the solution,

\[ \int \frac{1}{g(y)}\,dy = \int dt = t + C . \tag{57} \]

Exercise The radiative transfer equation in an absorbing, non-scattering, non-emitting medium is an “autonomous” ordinary differential equation given by dI/dτ = −I, where I is the beam intensity and τ is optical depth. Solve this differential equation, and comment on what the constant represents in the solution.

Separable ordinary differential equations can have the dependent and independent variables separated onto either side of the expression. These generally have the form,

\[ \frac{dy}{dt} = f(t)\, g(y) . \tag{58} \]

Solutions are relatively straightforward, as we can simply re-arrange to find,

\[ \int \frac{1}{g(y)}\,dy = \int f(t)\,dt . \tag{59} \]

Note that both integration and autonomous differential equations are sub-sets of separable equations. In my experience (at least), most differential equations in astronomy and planetary science are not so conveniently structured as to be separable. Nevertheless, separable equations provide an avenue to study the asymptotic behavior of systems (e.g., as $t \to \infty$).
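As an added illustration of the separable form, consider the assumed example $dy/dt = t\,y$ with $y(0) = 1$ (this example is not from the original notes). Separating variables gives $\int dy/y = \int t\,dt$, or $\ln y = t^2/2 + C$, and the initial condition sets $C = 0$, so $y(t) = e^{t^2/2}$. The Python sketch below compares this against a numerical integration using a standard fourth-order Runge-Kutta scheme; the function names are hypothetical.

```python
import math

def rk4(rhs, t0, y0, t_end, n_steps):
    """Integrate dy/dt = rhs(t, y) from t0 to t_end with classic RK4."""
    h = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2.0, y + h * k1 / 2.0)
        k3 = rhs(t + h / 2.0, y + h * k2 / 2.0)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return y

def rhs(t, y):
    """Assumed separable example: f(t) = t and g(y) = y."""
    return t * y

def analytic(t):
    """Solution from separating variables, with y(0) = 1."""
    return math.exp(t**2 / 2.0)

numeric = rk4(rhs, 0.0, 1.0, 1.0, 1000)
print(numeric, analytic(1.0))
```

Agreement between the two values is a useful check on both the algebra and the treatment of the integration constant.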

Exercise Explore the solution of $dy/dt = t / \left[ (t_0 + t) \cos y \right]$ at large times.

3.2 First-Order Linear Ordinary Differential Equations

Many differential equations in astronomy and planetary science (and physics) involve a single derivative, a single independent variable, and are linear in these quantities. These so-called first-order linear ordinary differential equations (which include many of the cases above) have well-studied solutions. Starting with the form,

\[ \frac{dy}{dt} + P(t)\, y = Q(t) , \tag{60} \]

we can find solutions by first multiplying by an “integrating factor” of the form,

\[ v(t) = e^{\int P(t)\,dt} . \tag{61} \]

This gives,

\[ v \frac{dy}{dt} + v P(t)\, y = v Q(t) , \tag{62} \]

where the left-hand side is simply equal to $d(vy)/dt$. We then have,

\[ \frac{d(vy)}{dt} = v Q(t) , \tag{63} \]

which has the solution,

\[ vy = \int v(t')\, Q(t')\,dt' + C , \tag{64} \]

or,

\[ y(t) = \frac{1}{v(t)} \int v(t')\, Q(t')\,dt' + \frac{C}{v(t)} , \tag{65} \]

where the prime symbol is meant to indicate the difference between the independent variable ($t$) and the integration variable ($t'$).
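To make the integrating-factor recipe concrete, consider the assumed example $dy/dt + 2y = e^{-t}$ (an added illustration, not from the original notes), so that $P(t) = 2$ and $Q(t) = e^{-t}$. The integrating factor is $v(t) = e^{2t}$, the integral $\int v(t')\,Q(t')\,dt'$ evaluates to $e^{t}$, and eq. (65) then gives $y(t) = e^{-t} + C e^{-2t}$. The Python sketch below checks that this expression satisfies the original equation for several values of $C$, using a central-difference estimate of $dy/dt$.

```python
import math

def solution(t, C):
    """y(t) = e^{-t} + C e^{-2t}, from eq. (65) with v(t) = e^{2t}."""
    return math.exp(-t) + C * math.exp(-2.0 * t)

def residual(t, C, h=1e-6):
    """dy/dt + P(t) y - Q(t) for the assumed P = 2 and Q = e^{-t}."""
    dydt = (solution(t + h, C) - solution(t - h, C)) / (2.0 * h)
    return dydt + 2.0 * solution(t, C) - math.exp(-t)

# The residual should vanish (to numerical precision) for any constant C.
for C in (-1.0, 0.0, 3.0):
    print(C, residual(0.5, C))
```

Because the residual vanishes for every $C$, the constant is not fixed by the differential equation itself; it is set by a boundary or initial condition.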

Exercise Show that $d(vy)/dt = v\,\frac{dy}{dt} + v P(t)\, y$, where $v(t) = \exp\left[ \int P(t)\,dt \right]$.

Exercise The radiative transfer equation in an absorbing, non-scattering, emitting medium has the form, dI/dτ = −I + B(τ), where B(τ) is the Planck function at location τ along the path. Solve this first-order linear ordinary differential equation. Again, comment on the role of the integration constant.

3.3 Partial Differential Equations

Partial differential equations are a category of differential equation which seeks solutions to expressions that relate unknown multivariate functions and their partial derivatives. For example, heat diffusing through a one-dimensional medium can be described by a partial differential equation, where the dependent variable (temperature, $T$) is related to the independent variables (position, $x$, and time, $t$) via,

\[ \frac{\partial}{\partial t} T(x, t) = \alpha \frac{\partial^2 T}{\partial x^2} , \tag{66} \]

where $\alpha$ is the thermal diffusivity of the medium (which can, in general, depend on both position and temperature).

Real physical systems are complicated and, as a result, the partial differential equations that describe the evolution of these systems are also quite complicated. Thus, and excepting cases where systems have been highly idealized (as is often the case in, e.g., textbook demonstrations and homework questions), analytic solutions for the dependent quantities seldom exist. In practice, then, solutions to partial differential equations are most often explored via numerical analysis. Boundary and initial conditions are specified, and numerical applications of the partial differential equations that describe the system then enable us to study time evolution.

Exercise Qualitatively describe how you would numerically investigate the time-evolution of a one-dimensional system undergoing heat diffusion.

4 A Practicality

One extremely useful (and powerful) tool for finding derivatives, performing integrals, and solving differential equations is Wolfram Alpha. Many people (myself included!) turn to this tool when stumped. That said, please do not take everything given to you by Wolfram Alpha as truth. Check the results from Wolfram Alpha by comparing against online derivative or integral tables, or use the techniques described above to check the results. I have had Wolfram Alpha tell me functions are not integrable when they actually are (through a substitution trick), and I have also had Wolfram Alpha return useless answers when it fails to interpret my input correctly. You don’t want to be the person who has to retract a paper because you blindly trusted a website!

Figure 1: Schematic showing the limit definition of a derivative.

Figure 2: Schematic showing the “area under a curve” definition of an integral.