
Math 3350 Supplementary Notes: Operators and Linearity

Dr. Kevin Long Texas Tech University March 3, 2008

The textbook introduces linear operators in almost an offhand way. This concept is so important for differential equations (and for applied mathematics in general) that it needs a more extensive explanation than the book provides. The notion of a linear operator may seem obvious, so much so that it’s sometimes hard at first to understand why mathematicians would make such a big deal out of such a simple concept. A short list of why linearity is a big deal would include:

• Equations defined in terms of linear operators obey the principle of superposition. Superposition is the foundation of two of the most powerful methods for solving practical problems: the method of Laplace transforms (which we will study in detail later in this course) and the method of Green’s functions (which you’ll see in a later course).

• Any theorem proved for equations involving linear operators applies to any type of equation involving any type of linear operator. Theorems about properties such as superposition, and methods such as Laplace transforms and Green’s functions apply not only to linear ordinary differential equations but to linear partial differential equations, linear delay equations, linear integral equations, linear integrodifferential equations, and many other kinds of problems. The details of all of these problem types are very different, but the linear structure is the same; that common structure lets us reuse ideas and methods between fields from statics to quantum mechanics.

I’ll repeat here what I told you in class: advanced mathematics is about using structure to simplify calculations. Linearity is an example of that concept: it is a structural property common to many types of problems; in particular, it is common to all linear ordinary differential equations. By referring to that abstract structure, we’ll be able to establish some very important theorems such as superposition and dependence of solutions on initial conditions in a very simple and general way, without going through a bunch of case-by-case calculations. These theorems then do two important practical things for us: first, they are used in developing solution methods such as Laplace transforms; second, they help us get physical insight that can be used to understand the behavior of an equation without doing complicated calculations.

Operators

An operator transforms one function to another. For example, the operator D transforms any function u to its derivative $u'$. We write this using the following notation:

$$D[u] = u'.$$

Operator notation is very similar to function notation; we will distinguish operators from functions by using square brackets [] rather than parentheses () to enclose the argument. Operator notation is similar to function notation because the operator concept is similar to the function concept: operators and functions both take “input” and produce “output.” The difference is that a function accepts one or more numbers as input and produces one or more numbers as output, whereas an operator accepts a function as input and produces a function as output.

Examples

Here are some simple examples:

• The identity operator I returns the input argument unchanged:

I[u] = u.

• The derivative operator D returns the derivative of the input:

$$D[u] = u'.$$

• The zero operator Z returns zero regardless of the input:

Z[u] = 0.
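One way to make the "function in, function out" idea concrete is to write the three operators as Python functions that accept a function and return a function. This is only a sketch: the names `identity`, `derivative`, and `zero` are made up for illustration, and the derivative is approximated by a central difference with an assumed step size `h` rather than computed exactly.

```python
import math

def identity(u):
    """I[u] = u: hand the input function back unchanged."""
    return u

def derivative(u, h=1e-5):
    """D[u] ~ u': central-difference approximation (the step h is an assumption)."""
    def du(t):
        return (u(t + h) - u(t - h)) / (2 * h)
    return du

def zero(u):
    """Z[u] = 0: ignore the input and return the zero function."""
    return lambda t: 0.0

# Each operator takes a function and returns a new function,
# which can then be evaluated at a number like any other function.
d_sin = derivative(math.sin)        # approximately the cosine function
print(identity(math.sin)(1.0))      # sin(1)
print(d_sin(1.0), math.cos(1.0))    # the two values agree closely
print(zero(math.sin)(1.0))          # 0.0
```

Note the two levels of evaluation: `derivative(math.sin)` is the operator acting on a function, while `d_sin(1.0)` is the resulting function acting on a number.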

Here are some other examples.

• Let’s represent as an operator the expression

$$y'' + 2y' + 5y.$$

We can write this as D[D[y]] + 2D[y] + 5I[y], where D and I are the derivative and identity operators defined previously. Most of the time we won’t write the identity operator explicitly, so our operator will be

D[D[y]] + 2D[y] + 5y.

We’ll also usually use the shorthand

$$D^2[y] \equiv D[D[y]]$$

so that the expression $y'' + 2y' + 5y$ is, in operator notation,

$$D^2[y] + 2D[y] + 5y.$$

Important: Notice that we use “power” notation $D^2[y]$ to represent repeated composition of an operator, not repeated multiplication of the results of an operator. The expression $D^3[u]$ means D[D[D[u]]], not $(D[u])^3$.

• We can represent the entire operation $D^2[y] + 2D[y] + 5y$ as an operator, by writing

$$A = D^2 + 2D + 5I$$

so that $A[y] = D^2[y] + 2D[y] + 5y$.
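As a quick check of this notation, here is a short worked example. The test functions $e^{3t}$ and $t^2$ are not from the text; they are chosen here only because their derivatives are easy to compute by hand.

```latex
\begin{align*}
A[e^{3t}] &= D^2[e^{3t}] + 2D[e^{3t}] + 5e^{3t} \\
          &= 9e^{3t} + 6e^{3t} + 5e^{3t} = 20e^{3t}, \\[4pt]
D^3[t^2]  &= D[D[D[t^2]]] = D[D[2t]] = D[2] = 0, \\
(D[t^2])^3 &= (2t)^3 = 8t^3 .
\end{align*}
```

The last two lines show why the distinction between composition and multiplication matters: for the same input $t^2$, repeated composition gives the zero function while cubing the output gives $8t^3$.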

Linear Operators

In this course we will concentrate on an important subset of operators: those which are linear.

Definition 0.1 Let u and v be arbitrary functions of a variable t, and let α be an arbitrary complex constant. A linear operator is any operator L having both of the following properties:

1. Distributivity over addition:

L[u + v] = L[u] + L[v]

2. Commutativity with multiplication by a constant:

αL[u] = L[αu]

Examples

1. The derivative operator D is a linear operator. To prove this, we simply check that D has both properties required for an operator to be linear.

• D distributes over addition. By the definition of D,

$$D[u + v] = \frac{d}{dt}(u + v)$$

and then by the sum rule for derivatives,

$$\frac{d}{dt}(u + v) = \frac{du}{dt} + \frac{dv}{dt}.$$

Putting the previous steps together demonstrates that D[u + v] = D[u] + D[v], so that D has property 1, distribution over addition.

• D commutes with multiplication by a constant. By the definition of D,

$$D[\alpha u] = \frac{d}{dt}(\alpha u)$$

which in turn equals (by the product rule, since the derivative of the constant α is zero)

$$\alpha \frac{du}{dt},$$

or αD[u]. Because D[αu] = αD[u], D has property 2.

Because D has both properties required of a linear operator, it is linear.

2. If A and B are linear operators, their sum A + B is a linear operator. The sum of two operators is defined to return the sum of their outputs:

(A + B)[u] ≡ A[u] + B[u].

We prove linearity by checking that A + B has the two properties required of a linear operator.

• A + B distributes over addition. From the definition of A + B we know

(A + B)[u + v] = A[u + v] + B[u + v].

We have specified that A and B are linear, so

(A + B)[u + v] = A[u] + A[v] + B[u] + B[v],

which after reordering the sum gives

(A + B)[u + v] = A[u] + B[u] + A[v] + B[v]

or

(A + B)[u + v] = (A + B)[u] + (A + B)[v]

which means A + B distributes over addition. A + B therefore has property 1.

• A + B commutes with multiplication by a constant. We have

(A + B)[αu] = A[αu] + B[αu]

and then because A and B are assumed to be linear,

(A + B)[αu] = αA[u] + αB[u].

Factoring out α on the right-hand side gives

(A + B)[αu] = α(A[u] + B[u])

or

(A + B)[αu] = α(A + B)[u].

A + B therefore has property 2. Having both properties required of a linear operator, A + B is linear.

3. Let’s next see an example of an operator that is not linear. Define the exponential operator

$$E[u] = e^u.$$

We test the two properties required for linearity:

• Property 1:

$$E[u + v] = e^{u + v} = e^u e^v = E[u]E[v]$$

which is not, in general, equal to E[u] + E[v]. Therefore the exponential operator does not have the first property required for linearity, and is therefore not a linear operator.

• Property 2:

$$E[\alpha u] = e^{\alpha u} = (e^u)^\alpha = E[u]^\alpha$$

which is not, in general, equal to αE[u]. Therefore the exponential operator does not have the second property required for linearity either.

4. Here’s another nonlinear operator:

$$A[u] = \left( \int_0^x (u(t))^3 \, dt \right)^{1/3}.$$

We’ll find that this operator has property 2 (commutativity with multiplication by a constant) but not property 1 (distributivity over addition).

• Property 1:

$$A[u + v] = \left( \int_0^x (u(t) + v(t))^3 \, dt \right)^{1/3}$$

$$= \left( \int_0^x \left[ (u(t))^3 + 3u(t)(v(t))^2 + 3v(t)(u(t))^2 + (v(t))^3 \right] dt \right)^{1/3}$$

which is not, in general, equal to

$$A[u] + A[v] = \left( \int_0^x (u(t))^3 \, dt \right)^{1/3} + \left( \int_0^x (v(t))^3 \, dt \right)^{1/3}.$$

Therefore A does not have property 1.

• Property 2:

$$A[\alpha u] = \left( \int_0^x (\alpha u(t))^3 \, dt \right)^{1/3} = \left( \alpha^3 \int_0^x (u(t))^3 \, dt \right)^{1/3} = \alpha \left( \int_0^x (u(t))^3 \, dt \right)^{1/3} = \alpha A[u].$$

Therefore A has property 2.

The operator A has property 2, but because a linear operator must have both properties, A is not linear.
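The four examples above can also be spot-checked numerically. The sketch below is not a proof: a failed check at one point does show that a property fails, but a passed check only shows that the property holds for one choice of u, v, α and t. Everything here is an assumption made for illustration: the helper names, the central-difference derivative, the midpoint-rule integral, the test functions u(t) = 1 + t² and v(t) = e^(−t), and the tolerance.

```python
import math

def derivative(u, h=1e-5):
    """D[u]: central-difference approximation to u'."""
    return lambda t: (u(t + h) - u(t - h)) / (2 * h)

def d_plus_identity(u):
    """(D + I)[u] = D[u] + u, a sum of two linear operators."""
    du = derivative(u)
    return lambda t: du(t) + u(t)

def exponential(u):
    """E[u] = e^u."""
    return lambda t: math.exp(u(t))

def cube_root_integral(u, n=2000):
    """A[u](x) = (integral from 0 to x of u(t)^3 dt)^(1/3), midpoint rule."""
    def a_of_u(x):
        dt = x / n
        total = sum(u((k + 0.5) * dt) ** 3 for k in range(n)) * dt
        return total ** (1.0 / 3.0)
    return a_of_u

def check(op, u, v, alpha, t, tol=1e-6):
    """Return (property 1 holds, property 2 holds) at the single point t."""
    def u_plus_v(s):
        return u(s) + v(s)
    def alpha_u(s):
        return alpha * u(s)
    p1 = abs(op(u_plus_v)(t) - (op(u)(t) + op(v)(t))) < tol
    p2 = abs(op(alpha_u)(t) - alpha * op(u)(t)) < tol
    return p1, p2

def u(s):
    return 1.0 + s * s      # positive test function

def v(s):
    return math.exp(-s)     # positive, and not a multiple of u

for name, op in [("D", derivative), ("D + I", d_plus_identity),
                 ("E", exponential), ("A", cube_root_integral)]:
    print(name, check(op, u, v, alpha=3.0, t=1.0))
# D (True, True)
# D + I (True, True)
# E (False, False)
# A (False, True)
```

The test functions are kept positive so that the real cube root in A is well defined, and v is deliberately not a constant multiple of u: if it were, property 2 alone would force A[u + v] = A[u] + A[v] for that pair and the check would miss the failure of property 1.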

Worked Examples

Here are some worked examples relating to problems in basic statics. Imagine a 3D object V, with fixed boundary surface and known density distribution function ρ(x). In statics, we would need to compute a number of properties of this body, such as its total mass and center of mass. All of these computations can be expressed in terms of operators acting on the density function ρ. These operators will be integral operators, meaning that they each involve the calculation of an integral involving the density function ρ. Later in the semester we’ll see another integral operator useful in solving differential equations: the Laplace transform.

Operators for mass and center of mass

The mass M of an object V is computed by integrating the density over volume:

$$M = \int_V \rho(x) \, dV.$$

The integral is a volume integral, which you will remember from calculus 3; if you’re not good with volume integrals, don’t worry, we won’t do any actual computations with them here. We’re just interested in the structure of the operations.

We can regard the computation of mass as an operator acting on the density function ρ: it takes ρ as an input, and produces a constant function M as output. In operator notation, we write

$$M[\rho] = \int_V \rho \, dV.$$

Notice that this is a little bit different from our previous examples, in that the output is a number (the mass) rather than a function. If it makes you happier, you can think of the mass M as a constant-valued function; it then becomes similar to the zero operator, which returned a function whose value has the constant value zero everywhere.

Is the mass operator linear? Let’s check, by testing the two properties required for linearity.

• Property 1: distribution over addition.

$$M[\rho_1 + \rho_2] = \int_V (\rho_1 + \rho_2) \, dV$$

which by the rule for integrals of sums is

$$M[\rho_1 + \rho_2] = \int_V \rho_1 \, dV + \int_V \rho_2 \, dV$$

or

$$M[\rho_1 + \rho_2] = M[\rho_1] + M[\rho_2].$$

Therefore, the mass operator has property 1.

• Property 2: Commutation with multiplication by a constant.

$$M[\alpha\rho] = \int_V \alpha\rho \, dV,$$

and because we can always pull a constant out of an integral,

$$M[\alpha\rho] = \alpha \int_V \rho \, dV = \alpha M[\rho].$$

Therefore, the mass operator has property 2.

Having both properties required of a linear operator, the mass operator is linear.

At this point it might appear that all useful operators are linear. Here’s one that’s not: the operator that computes the center of mass given a density distribution. Recalling the definition of a center of mass from your course in statics, we define the operator

$$X[\rho] = \frac{\int_V x\rho \, dV}{\int_V \rho \, dV}$$

which, given a density function ρ, computes the center of mass position vector X. Notice that the denominator is the mass operator M[ρ]. The center of mass operator X is not linear. Let’s check the two linearity properties:

• Property 1:

$$X[\rho_1 + \rho_2] = \frac{\int_V x(\rho_1 + \rho_2) \, dV}{M[\rho_1] + M[\rho_2]}$$

which, because $\int_V x\rho_i \, dV = M[\rho_i]X[\rho_i]$, simplifies to

$$X[\rho_1 + \rho_2] = \frac{M[\rho_1]X[\rho_1] + M[\rho_2]X[\rho_2]}{M[\rho_1] + M[\rho_2]}.$$

The right-hand side is the mass-weighted average of X[ρ1] and X[ρ2]. There is no way to rewrite it as X[ρ1] + X[ρ2], so the operator is not linear.

• Having shown that X does not have property 1, it’s not necessary to check property 2: to be linear, it must have both properties. However, because this is a worked example problem, I’ll work out property 2 as well. We have

$$X[\alpha\rho] = \frac{\int_V x\alpha\rho \, dV}{M[\alpha\rho]}.$$

After pulling α out of the top integral and using the linearity of M in the denominator, we get

$$X[\alpha\rho] = \frac{\alpha \int_V x\rho \, dV}{\alpha M[\rho]}.$$

The α’s cancel, leaving us with

$$X[\alpha\rho] = X[\rho],$$

which is not, in general, equal to αX[ρ]. Therefore, X does not have property 2 either.
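The mass and center of mass calculations above can be tried on a small discretized example. This sketch replaces the 3D body by a hypothetical 1D rod cut into four unit cells, so the volume integrals become finite sums. The cell positions, the sample densities, and the function names are all assumptions chosen to keep the arithmetic exact and easy to follow.

```python
from fractions import Fraction

# Assumption for illustration: a 1D rod split into four unit cells whose
# centers sit at x = 0, 1, 2, 3; a density is a list of per-cell values.
X_CENTERS = [0, 1, 2, 3]
DV = 1  # volume of each cell

def mass(rho):
    """M[rho]: sum of rho * dV over the cells."""
    return sum(r * DV for r in rho)

def center_of_mass(rho):
    """X[rho]: (sum of x * rho * dV) / M[rho], kept exact with Fraction."""
    moment = sum(x * r * DV for x, r in zip(X_CENTERS, rho))
    return Fraction(moment, mass(rho))

rho1 = [1, 1, 1, 1]
rho2 = [0, 0, 2, 2]
rho_sum = [a + b for a, b in zip(rho1, rho2)]
rho_scaled = [3 * a for a in rho1]

# Mass is additive and scales proportionally.
print(mass(rho_sum), mass(rho1) + mass(rho2))             # 8 8
print(mass(rho_scaled), 3 * mass(rho1))                   # 12 12

# Center of mass is neither additive nor proportional.
print(center_of_mass(rho_sum))                            # 2
print(center_of_mass(rho1) + center_of_mass(rho2))        # 4
print(center_of_mass(rho_scaled), center_of_mass(rho1))   # 3/2 3/2

# It is the mass-weighted average of the two centers of mass.
weighted = (mass(rho1) * center_of_mass(rho1)
            + mass(rho2) * center_of_mass(rho2)) / (mass(rho1) + mass(rho2))
print(weighted)                                           # 2
```

Using `Fraction` avoids floating-point rounding, so the equalities and inequalities printed here are exact for this discretized rod.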

So the mass operator M is linear but the center of mass operator X is not. We proved this mathematically using the definition of a linear operator; let’s now step back and check that this conclusion makes engineering sense.

What does linearity mean in this context? The distributivity of the mass operator over addition means that if I add two densities, the total mass is the sum of the individual masses. When mixing copper and tin to make bronze alloy, we know experimentally that the mass of the bronze product is equal to the mass of the copper input plus the mass of the tin input. We also know that if in an object of fixed size we replace one material by another α times as dense, the total mass of the object is multiplied by α. Linearity of the mass operator is the mathematical description of these physical facts.

What about the nonlinearity of the center of mass operator? Imagine a solid, constant-density object; say, a steel block. Make another block of identical size and shape, but make it out of foam a hundred times less dense than steel. The mass will be less, but the center of mass position is unchanged: you could balance the foam block and the steel block at exactly the same point. For center of mass, what matters is the distribution of mass, not the total mass. Masses scale linearly, but center of mass positions don’t.

In these examples from statics we see operators that are rather different from what we use in differential equations, but they illustrate in a familiar context the difference between linear and nonlinear operators. Linear operators represent properties (such as mass) that are additive and scale proportionally. Nonlinear operators represent those properties (such as center of mass) that don’t combine by simple addition and don’t scale proportionally.

Exercises

1. Prove that if L is a linear operator, then L[αu + βv] = αL[u] + βL[v]

2. If A and B are linear operators, prove that their composition A[B[u]] is a linear operator.

3. Prove that the sum of a linear operator and a nonlinear operator is nonlinear.

4. Prove that the composition of a linear operator and a nonlinear operator is nonlinear.

5. Define the operator A[u] = u + 1. Prove that A is not a linear operator.

6. Define the operator $B[u] = u^2$. Prove that B is not a linear operator.

7. Define the operator $A[u] = D^2[u] + 3u$. Show that A[u] is a linear operator.

8. In rotational dynamics, you will encounter the moment of inertia

$$I_z[\rho] = \int_V (x^2 + y^2)\rho \, dV$$

where ρ is the mass density, and also the radius of gyration

$$\Gamma[\rho] = \sqrt{\frac{I_z[\rho]}{M[\rho]}}$$

where M[ρ] is the mass operator defined above. Are $I_z$ and Γ linear operators?
