CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GENERALIZED FUNCTIONS
GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO
Abstract. We present an extension of some results of higher order calculus of variations and optimal control to generalized functions. The framework is the category of generalized smooth functions, which includes Schwartz distributions, while sharing many nonlinear prop- erties with ordinary smooth functions. We prove the higher order Euler-Lagrange equations, the D’Alembert principle in differential form, the du Bois-Reymond optimality condition and the Noether’s theorem. We start the theory of optimal control proving a weak form of the Pontryagin maximum principle and the Noether’s theorem for optimal control. We close with a study of a singularly variable length pendulum, oscillations damped by two media and the Pais–Uhlenbeck oscillator with singular frequencies.
Contents 1. Introduction: singular Lagrangian mechanics1 2. Basic notions 3 2.1. Open, closed and bounded sets generated by nets5 2.2. Generalized smooth functions and their calculus5 2.3. Embedding of Sobolev-Schwartz distributions and Colombeau generalized functions8 2.4. Functionally compact sets and multidimensional integration 10 3. Higher-order calculus of variations for generalized functions 13 3.1. Higher-order Euler–Lagrange equation, D’Alembert principle and du Bois–Reymond optimality condition 15 3.2. Higher-order Noether’s theorem 19 4. Optimal control problems 22 4.1. Weak Pontryagin Maximum Principle 24 4.2. Noether’s Principle for optimal control 29 5. Examples and applications 31 5.1. Modeling singular dynamical systems 31 5.2. How to use numerical solutions 31 5.3. Singularly variable length pendulum 32 5.4. Oscillations damped by two media 36 arXiv:2011.09660v1 [math.FA] 18 Nov 2020 5.5. Pais–Uhlenbeck oscillator with singular frequencies 38 6. Conclusions 39 References 40
1. Introduction: singular Lagrangian mechanics Leibniz’s axiom Natura non facit saltus seems to be inconsistent not only at the atomic level, but also for several macroscopic phenomena. Even the simple motion of an elastic bouncing
2020 Mathematics Subject Classification. 49-XX, 46Fxx, 46F30. Key words and phrases. Calculus of variations, optimal control, Schwartz distributions, generalized functions for nonlinear analysis. M.J. Lazo has been supported by CNPq and FAPERGS. P. Giordano has been supported by grants P30407, P33538 of the Austrian Science Fund FWF. 1 2 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO ball seems to be more easily modeled using non-differentiable functions than classical C2 ones, at least if we are not interested to model the non-trivial behavior at the collision times. Clearly, we can always use smooth functions to approximately model the forces acting on the ball, but this would introduce additional parameters (see e.g. [58, 65] for smooth approximations of steep potentials in the billiard problem). Therefore, the motivation to introduce a suitable kind of generalized functions formalism in classical mechanics is clear: is it possible, e.g., to generalize Lagrangian mechanics so that the Lagrangian can be presented as a generalized function? This would undoubtedly be of an applicable advantage, since many relevant systems are described by singular Lagrangians: non-smooth constraints, collisions between two or more bodies, motion in different or in granular media, discontinuous propagation of rays of light, even turning on the switch of an electrical circuit, to name but a few, and only in the framework of classical physics. Indeed, this type of problems is widely studied (see e.g. [4,28,34,37,42–44,62,64]), but sometimes the presented solutions are not general or hold only for special conditions and particular potentials. From the purely mathematical point of view, an enlarged space of solutions (with distributional derivatives) for variational problems leads to the the so-called Lavrentiev phenomenon, see [5,27]. Variational problems with singular solutions (e.g. non-differentiable on a dense set, or with infinite derivatives at some points) are also studied, see [10, 11, 27, 61]. In this sense, the fact that since J.D. Marsden’s works [49–51] there has not been many further attempt to use Sobolev-Schwartz distributions for the description of Hamiltonian mechanics, can be considered as a clue that the classical distributional framework is not well suited to face this problem in general terms (see also [56] for a similar approach with a greater focus on the gener- alization in geometry). In [49–51], J.D. Marsden introduces distributions on manifolds based on flows. Since the traditional system of Hamilton’s equations breaks down, Marsden considers the flow as a limit of smooth ones. On the other hand, Kunzinger et al. in [36] (using Colombeau generalized functions) critically analyses the regularization approach put forward by Marsden and constructed a counter example for the main flow theorems of [49]. In this field, a number of prob- lems remains open, see again [49]. For example, in the non-smooth case the variational theorems fail, and the study of the virial equation for hard spheres in a box is still an open question. All this motivates the use of different spaces of nonlinear generalized functions in mathematical physics and in singular calculus of variations, see e.g. [2,8, 25, 29, 34, 66]. In this paper, we introduce an approach to variational problems involving singularities that allows the extension of the classical theory with very natural statements and proofs. We are interested in extremizing functionals which are either distributional themselves or whose set of extremals includes generalized functions. Clearly, distribution theory, being a linear theory, has certain difficulties when nonlinear problems are in play. To overcome this type of problems, we are going to use the category of generalized smooth functions (GSF), see [20–23, 41]. This theory seems to be a good candidate, since it is an exten- sion of classical distribution theory which allows to model nonlinear singular problems, while at the same time sharing many nonlinear properties with ordinary smooth functions, like the closure with respect to composition and several non trivial classical theorems of the calculus. One could describe GSF as a methodological restoration of Cauchy-Dirac’s original conception of generalized function, see [12,33,39]. In essence, the idea of Cauchy and Dirac (but also of Poisson, Kirchhoff, Helmholtz, Kelvin and Heaviside) was to view generalized functions as suitable types of smooth set-theoretical maps obtained from ordinary smooth maps depending on suitable infinitesimal or infinite parameters. For example, the density of a Cauchy-Lorentz distribution with an infini- tesimal scale parameter was used by Cauchy to obtain classical properties which nowadays are attributed to the Dirac delta, cf. [33]. In the present work, the foundation of the calculus of variations is set for functionals defined by arbitrary GSF. This in particular applies to any Schwartz distribution and any Colombeau generalized function (see e.g. [7, 29]). The main aim of the paper is to start the higher-order calculus of variations and the theory of optimal control for GSF. The structure of the paper is as follows. We start with an introduction into the setting of GSF and give basic notions concerning GSF and their calculus that are needed for the calculus of variations (Sec.2). After the basic definitions to set the problem in the framework of CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 3
GSF, we prove the higher order Euler-Lagrange equations, the D’Alembert principle in differential form, the du Bois-Reymond optimality condition and the Noether’s theorem (Sec.3). In Sec.4, we start the theory of optimal control proving a weak form of the Pontryagin maximum principle and the Noether’s theorem, and in Sec.5 we close with a study of a singularly variable length pendulum, oscillations damped by two media and the Pais–Uhlenbeck oscillator with singular frequencies. The paper is self-contained, in the sense that it contains all the statements required for the proofs of calculus of variations we are going to present. If proofs of preliminaries are omitted, we clearly give references to where they can be found. Therefore, to understand this paper, only a basic knowledge of distribution theory is needed.
2. Basic notions The new ring of scalars. In this work, I denotes the interval (0, 1] ⊆ R and we will always use I the variable ε for elements of I; we also denote ε-dependent nets x ∈ R simply by (xε). By N we denote the set of natural numbers, including zero. We start by defining a new simple non-Archimedean ring of scalars that extends the real field R. The entire theory is constructive to a high degree, e.g. neither ultrafilter nor non-standard method are used. For all the proofs of results in this section, see [20–23].
I + Definition 1. Let ρ = (ρε) ∈ (0, 1] be a non-decreasing net such that (ρε) → 0 as ε → 0 (in the following, such a net will be called a gauge), then −a (i) I(ρ) := {(ρε ) | a ∈ R>0} is called the asymptotic gauge generated by ρ. 0 (ii) If P(ε) is a property of ε ∈ I, we use the notation ∀ ε : P(ε) to denote ∃ε0 ∈ I ∀ε ∈ (0, ε0]: P(ε). We can read ∀0ε as for ε small. I (iii) We say that a net (xε) ∈ R is ρ-moderate, and we write (xε) ∈ Rρ if + ∃(Jε) ∈ I(ρ): xε = O(Jε) as ε → 0 , i.e., if 0 −N ∃N ∈ N ∀ ε : |xε| ≤ ρε . I (iv) Let (xε), (yε) ∈ R , then we say that (xε) ∼ρ (yε) if −1 + ∀(Jε) ∈ I(ρ): xε = yε + O(Jε ) as ε → 0 , that is if 0 n ∀n ∈ N ∀ ε : |xε − yε| ≤ ρε . (2.1) This is a congruence relation on the ring Rρ of moderate nets with respect to pointwise operations, and we can hence define
ρ Re := Rρ/ ∼ρ, which we call Robinson-Colombeau ring of generalized numbers. This name is justified by [6, 59]: Indeed, in [59] A. Robinson introduced the notion of moderate and negligible nets depending on an arbitrary fixed infinitesimal ρ (in the framework of nonstandard analysis); independently, J.F. Colombeau, cf. e.g. [6] and references therein, studied the same concepts without using nonstandard analysis, but considering only the particular gauge ρε = ε.
ρ I We can also define an order relation on Re by saying that [xε] ≤ [yε] if there exists (zε) ∈ R such that (zε) ∼ρ 0 (we then say that (zε) is ρ-negligible) and xε ≤ yε + zε for ε small. Equivalently, we have that x ≤ y if and only if there exist representatives [xε] = x and [yε] = y such that xε ≤ yε for all ε. Although the order ≤ is not total, we still have the possibility to define the infimum [xε] ∧ [yε] := [min(xε, yε)], the supremum [xε] ∨ [yε] := [max(xε, yε)] of a finite number ρ of generalized numbers. Henceforth, we will also use the customary notation Re∗ for the set of ρ invertible generalized numbers, and we write x < y to say that x ≤ y and x − y ∈ Re∗. Our ρ notations for intervals are: [a, b] := {x ∈ Re | a ≤ x ≤ b},[a, b]R := [a, b] ∩ R, and analogously for ρ n n n segments [x, y] := {x + r · (y − x) | r ∈ [0, 1]} ⊆ Re and [x, y]R = [x, y] ∩ R . As in every non-Archimedean ring, we have the following 4 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO
ρ Definition 2. Let x ∈ Re be a generalized number, then (i) x is infinitesimal if |x| ≤ r for all r ∈ R>0. If x = [xε], this is equivalent to limε→0+ xε = 0. We write x ≈ y if x − y is infinitesimal. (ii) x is infinite if |x| ≥ r for all r ∈ R>0. If x = [xε], this is equivalent to limε→0+ |xε| = +∞. (iii) x is finite if |x| ≤ r for some r ∈ R>0. ρ n ρ For example, setting dρ := [ρε] ∈ Re, we have that dρ ∈ Re, n ∈ N>0, is an invertible infinitesimal, −n −n whose reciprocal is dρ = [ρε ], which is necessarily a positive infinite number. Of course, in ρ the ring Re there exist generalized numbers which are not in any of the three classes of Def.2, like 1 1 e.g. xε = ε sin ε . The following result is useful to deal with positive and invertible generalized numbers. For its proof, see e.g. [29].
ρ Lemma 3. Let x ∈ Re. Then the following are equivalent: (i) x is invertible and x ≥ 0, i.e. x > 0. 0 (ii) For each representative (xε) ∈ Rρ of x we have ∀ ε : xε > 0. 0 m (iii) For each representative (xε) ∈ Rρ of x we have ∃m ∈ N ∀ ε : xε > ρε . 0 m (iv) There exists a representative (xε) ∈ Rρ of x such that ∃m ∈ N ∀ ε : xε > ρε . ρ ρ ρ Topologies on Ren. On the Re-module Ren we can consider the natural extension of the Euclidean ρ ρ n ρ norm, i.e. |[xε]| := [|xε|] ∈ Re, where [xε] ∈ Re . Even if this generalized norm takes values in Re, it shares some essential properties with classical norms: |x| = x ∨ (−x) |x| ≥ 0 |x| = 0 ⇒ x = 0 |y · x| = |y| · |x| |x + y| ≤ |x| + |y| ||x| − |y|| ≤ |x − y|.
ρ It is therefore natural to consider on Ren a topology generated by balls defined by this generalized ρ norm and the set of radii Re>0 of positive invertible numbers: ρ ρ Definition 4. Let c ∈ Ren and x, y ∈ Re, then: n ρ n o ρ (i) Br(c) := x ∈ Re | |x − c| < r for each r ∈ Re>0. E n n (ii) Br (c) := {x ∈ R | |x − c| < r}, for each r ∈ R>0, denotes an ordinary Euclidean ball in R if c ∈ Rn. The relation < has better topological properties as compared to the usual strict order relation n ρ ρ no a ≤ b and a 6= b (that we will never use) because the set of balls Br(c) | r ∈ Re>0, c ∈ Re ρ is a base for a sequentially Cauchy complete topology on Ren called sharp topology. We will call sharply open set any open set in the sharp topology. The existence of infinitesimal neighborhoods (e.g. r = dρ) implies that the sharp topology induces the discrete topology on R. This is a necessary result when one has to deal with continuous generalized functions which have infinite 0 derivatives. In fact, if f (x0) is infinite, we have f(x) ≈ f(x0) only for x ≈ x0 , see [21]. Also ρ open intervals are defined using the relation <, i.e. (a, b) := {x ∈ Re | a < x < b}. Note that by ρ m Lem.3.(iii), for all r ∈ Re>0 there exists m ∈ N such that r ≥ dρ . Therefore, also the set of n ρ no balls Bdρm (c) | m ∈ N, c ∈ Re is a base for the sharp topology. ρ The lecturer can feel uneasy in considering a ring of scalar such as Re instead of a field. On the other hand, as mentioned above, we will see that all our generalized functions are continuous in the sharp topology. Therefore, the following result is partly reassuring:
ρ Lemma 5. Invertible elements of Re are dense in the sharp topology, i.e. ρ ρ ∀h ∈ Re ∀δ ∈ Re>0 ∃k ∈ (h − δ, h + δ): k is invertible. CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 5
2.1. Open, closed and bounded sets generated by nets. A natural way to obtain sharply ρ n n open, closed and bounded sets in Re is by using a net (Aε) of subsets Aε ⊆ R . We have two ρ n ways of extending the membership relation xε ∈ Aε to generalized points [xε] ∈ Re (cf. [22, 54]): n Definition 6. Let (Aε) be a net of subsets of R , then n ρ n 0 o (i)[ Aε] := [xε] ∈ Re | ∀ ε : xε ∈ Aε is called the internal set generated by the net (Aε). n (ii) Let (xε) be a net of points of R , then we say that xε ∈ε Aε, and we read it as (xε) strongly belongs to (Aε), if 0 (i) ∀ ε : xε ∈ Aε. 0 0 (ii) If (xε) ∼ρ (xε), then also xε ∈ Aε for ε small. n ρ n o Moreover, we set hAεi := [xε] ∈ Re | xε ∈ε Aε , and we call it the strongly internal set generated by the net (Aε). ρ (iii) We say that the internal set K = [Aε] is sharply bounded if there exists M ∈ Re>0 such that K ⊆ BM (0). (iv) Finally, we say that the (Aε) is a sharply bounded net if there exists N ∈ R>0 such that 0 −N ∀ ε ∀x ∈ Aε : |x| ≤ ρε .
Therefore, x ∈ [Aε] if there exists a representative [xε] = x such that xε ∈ Aε for ε small, whereas this membership is independent from the chosen representative in case of strongly internal sets. n An internal set generated by a constant net Aε = A ⊆ R will simply be denoted by [A]. The following theorem (cf. [22,23,54]) shows that internal and strongly internal sets have dual topological properties:
n n Theorem 7. For ε ∈ I, let Aε ⊆ R and let xε ∈ R . Then we have 0 q (i) [xε] ∈ [Aε] if and only if ∀q ∈ R>0 ∀ ε : d(xε,Aε) ≤ ρε. Therefore [xε] ∈ [Aε] if and only if ρ [d(xε,Aε)] = 0 ∈ Re. 0 c q c n (ii) [xε] ∈ hAεi if and only if ∃q ∈ R>0 ∀ ε : d(xε,Aε) > ρε, where Aε := R \ Aε. Therefore, c c if (d(xε,Aε)) ∈ Rρ, then [xε] ∈ hAεi if and only if [d(xε,Aε)] > 0. (iii) [Aε] is sharply closed. (iv) hAεi is sharply open. n (v) [Aε] = [cl (Aε)], where cl (S) is the closure of S ⊆ R . n (vi) hAεi = hint(Aε)i, where int (S) is the interior of S ⊆ R . For example, it is not hard to show that the closure in the sharp topology of a ball of center c = [cε] and radius r = [rε] > 0 is n o h i B (c) = x ∈ ρ d | |x − c| ≤ r = BE (c ) , (2.2) r Re rε ε whereas n o B (c) = x ∈ ρ d | |x − c| < r = hBE (c )i. r Re rε ε
ρ 2.2. Generalized smooth functions and their calculus. Using the ring Re, it is easy to consider a Gaussian with an infinitesimal standard deviation. If we denote this probability density ρ by f(x, σ), and if we set σ = [σε] ∈ Re>0, where σ ≈ 0, we obtain the net of smooth functions (f(−, σε))ε∈I . This is the basic idea we are going to develop in the following
ρ ρ Definition 8. Let X ⊆ Ren and Y ⊆ Red be arbitrary subsets of generalized points. Then we say that f : X −→ Y is a generalized smooth function ∞ d if there exists a net fε ∈ C (Ωε, R ) defining f in the sense that (i) X ⊆ hΩεi, (ii) f([xε]) = [fε(xε)] ∈ Y for all x = [xε] ∈ X, α d n (iii)( ∂ fε(xε)) ∈ Rρ for all x = [xε] ∈ X and all α ∈ N . The space of generalized smooth functions (GSF) from X to Y is denoted by ρGC∞(X,Y ). 6 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO
Let us note explicitly that this definition states minimal logical conditions to obtain a set- theoretical map from X into Y and defined by a net of smooth functions of which we can take arbitrary derivatives still remaining in the space of ρ-moderate nets. In particular, the following Thm.9 states that the equality f([xε]) = [fε(xε)] is meaningful, i.e. that we have independence α ρ d n from the representatives for all derivatives [xε] ∈ X 7→ [∂ fε(xε)] ∈ Re , α ∈ N . ρ n ρ d Theorem 9. Let X ⊆ Re and Y ⊆ Re be arbitrary subsets of generalized points. Let fε ∈ ∞ d C (Ωε, R ) be a net of smooth functions that defines a generalized smooth map of the type X −→ Y , then n 0 n 0 α α 0 (i) ∀α ∈ N ∀(xε), (xε) ∈ Rρ :[xε] = [xε] ∈ X ⇒ [∂ fε(xε)] = [∂ fε(xε)]. (ii) Each f ∈ ρGC∞(X,Y ) is continuous with respect to the sharp topologies induced on X, Y . ∞ n d (iii) f : X −→ Y is a GSF if and only if there exists a net vε ∈ C (R , R ) defining a generalized smooth map of type X −→ Y such that f = [vε(−)]|X . ρ (iv) GSF are closed with respect to composition, i.e. subsets S ⊆ Res with the trace of the sharp topology, and GSF as arrows form a subcategory of the category of topological spaces. We will call this category ρGC∞, the category of GSF. Therefore, with pointwise sum and product, ρ ∞ ρ any space GC (X, Re) is an algebra. Similarly, we can define generalized functions of class ρGCk, with k ≤ +∞:
ρ ρ Definition 10. Let X ⊆ Ren and Y ⊆ Red be arbitrary subsets of generalized points and k ∈ N ∪ {+∞}. Then we say that f : X −→ Y is a generalized Ck function k d if there exists a net fε ∈ C (Ωε, R ) defining f in the sense that (i) X ⊆ hΩεi, (ii) f([xε]) = [fε(xε)] ∈ Y for all x = [xε] ∈ X, α d n (iii)( ∂ fε(xε)) ∈ Rρ for all x = [xε] ∈ X and all α ∈ N such that |α| ≤ k. n 0 0 α α 0 (iv) ∀α ∈ N ∀[xε], [xε] ∈ X : |α| = k, [xε] = [xε] ⇒ [∂ fε(xε)] = [∂ fε(xε)]. n α ρ d (v) For all α ∈ N , with |α| = k, the map [xε] ∈ X 7→ [∂ fε(xε)] ∈ Re is continuous in the sharp topology. The space of generalized Ck functions from X to Y is denoted by ρGCk(X,Y ). Note that properties (iv), (v) are required only for |α| = k because for lower length they can be proved using property (iii) and the classical mean value theorem for fε (see e.g. [22, 23]). From (i) and (ii) of Thm.9 it follows that this definition of ρGCk is equivalent to Def.8 if k = +∞. Moreover, properties similar to (iii), (iv) of Thm.9 can also be proved for ρGCk. The differential calculus for ρGCk+1 maps can be introduced by showing existence and unique- ness of another ρGCk map serving as incremental ratio (sometimes this is called derivative ¨ı¿œla Carath¨ı¿œodory, see e.g. [35]).
ρ n ρ n Theorem 11. Let U ⊆ Re be a sharply open set, let v = [vε] ∈ Re , k ∈ N ∪ {+∞}, and let ρ k+1 ρ ρ k+1 k+1 f ∈ GC (U, Re) be a GC map generated by the net of functions fε ∈ C (Ωε, R). Then ρ k ρ (i) There exists a sharp neighborhood T of U × {0} and a map r ∈ GC (T, Re), called the generalized incremental ratio of f along v, such that ∀(x, h) ∈ T : f(x + hv) = f(x) + h · r(x, h). (ii) Any two generalized incremental ratios coincide on a sharp neighborhood of U × {0}, so that we can use the notation ∂f [x; h] := r(x, h) if (x, h) are sufficiently small. h ∂v i ∂f ∂fε ∂f (iii) We have [x; 0] = (xε) for every x ∈ U and we can thus define Df(x) · v := (x) := ∂v ∂vε ∂v ∂f ∂f ρ k ρ ∂v [x; 0], so that ∂v ∈ GC (U, Re). Note that this result permits us to consider the partial derivative of f with respect to an ρ arbitrary generalized vector v ∈ Ren which can be, e.g., infinitesimal or infinite. Using recursively this result, we can also define subsequent differentials Djf(x) as j−multilinear maps, and we set CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 7
j ρ ρ Djf(x)·hj := Djf(x)(h, . . .j . . . , h) if j ≤ k. The set of all the j−multilinear maps Ren −→ Red ρ j ρ n ρ d j ρ n ρ d over the ring Re will be denoted by L ( Re , Re ). For A = [Aε(−)] ∈ L ( Re , Re ), we set |A| := [|Aε|], the generalized number defined by the operator norms of the multilinear maps j n d Aε ∈ L (R , R ). The following result follows from the analogous properties for the nets of smooth functions defining f and g.
ρ ρ Theorem 12. Let U ⊆ Ren be an open subset in the sharp topology, let v ∈ Ren and f, g ∈ ρ k+1 ρ GC (U, Red). Then ∂(f+g) ∂f ∂g (i) ∂v = ∂v + ∂v ∂(r·f) ∂f ρ (ii) ∂v = r · ∂v ∀r ∈ Re ∂(f·g) ∂f ∂g (iii) ∂v = ∂v · g + f · ∂v ∂f ρ ρ ρ n (iv) For each x ∈ U, the map df(x).v := ∂v (x) ∈ Re is Re-linear in v ∈ Re ρ ρ ρ k+1 (v) Let U ⊆ Ren and V ⊆ Red be open subsets in the sharp topology and g ∈ GC (V,U), ρ k+1 ρ ρ f ∈ GC (U, Re) be generalized maps. Then for all x ∈ V and all v ∈ Red, we have ∂(f◦g) ∂g ∂v (x) = df (g(x)) . ∂v (x).
ρ ρ Note that the absolute value function | − | : Re −→ Re is not a GSF because its derivative is not sharply continuous at the origin; clearly, it is a ρGC0 function. This is a good motivation to introduce integration of ρGCk functions. One dimensional integral calculus is based on the following
ρ k ρ ρ Theorem 13. Let k ∈ N ∪ {+∞} and f ∈ GC ([a, b], Re) be defined in the interval [a, b] ⊆ Re, where a < b. Let c ∈ [a, b]. Then, there exists one and only one generalized Ck+1 map F ∈ ρ k+1 ρ GC ([a, b], Re) such that F (c) = 0 and F 0(x) = f(x) for all x ∈ [a, b]. Moreover, if f is defined k h xε i by the net fε ∈ C ( , ) and c = [cε], then F (x) = fε(s)ds for all x = [xε] ∈ [a, b]. R R cε ´ We can thus define (−) (−) Definition 14. Under the assumptions of Thm. 13, we denote by c f := c f(s) ds ∈ ρ k+1 ρ GC ([a, b], Re) the unique generalized Ck+1 map such that: ´ ´ c (i) c f = 0 0 ´ (−) d x (ii) u f (x) = dx u f(s)ds = f(x) for all x ∈ [a, b]. ´ ´ All the classical rules of integral calculus hold in this setting:
ρ k ρ ρ k ρ Theorem 15. Let f ∈ GC (U, Re) and g ∈ GC (V, Re) be generalized Ck maps defined on sharply ρ ρ open domains in Re. Let a, b ∈ Re with a < b and c, d ∈ [a, b] ⊆ U ∩ V , then d d d (i) c (f + g) = c f + c g ´ d d ´ ´ ρ (ii) c λf = λ c f ∀λ ∈ Re ´ d e ´ d (iii) c f = c f + e f for all e ∈ [a, b] ´ d ´ c ´ (iv) c f = − d f ´ d 0 ´ (v) c f = f(d) − f(c) ´ d 0 d d 0 (vi) c f · g = [f · g]c − c f · g ´ ´ b b (vii) If f(x) ≤ g(x) for all x ∈ [a, b], then a f ≤ a g. ρ ρ k+1 ρ (viii) Let a, b, c, d ∈ Re, with a < b and c <´ d, and´ f ∈ GC ([a, b] × [c, d], Red), then d b b ∂ f(τ, s) dτ = f(τ, s) dτ ∀s ∈ [c, d]. ds ˆa ˆa ∂s 8 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO
ρ k ρ ρ k+1 Theorem 16. Let f ∈ GC (U, Re) and ϕ ∈ GC (V,U) be GSF defined on sharply open domains ρ ρ in Re. Let a, b ∈ Re, with a < b, such that [a, b] ⊆ V , ϕ(a) < ϕ(b), [ϕ(a), ϕ(b)] ⊆ U. Finally, assume that ϕ([a, b]) ⊆ [ϕ(a), ϕ(b)]. Then ϕ(b) b f(t)dt = f [ϕ(s)] · ϕ0(s)ds. ˆϕ(a) ˆa We also have a generalization of Taylor formula:
ρ k ρ Theorem 17. Let f ∈ GC (U, Re) be a generalized Ck+1 function defined in the sharply open set ρ ρ U ⊆ Red. Let a, b ∈ Red such that the line segment [a, b] ⊆ U, and set h := b − a. Then, for all n ∈ N≤k we have j n+1 Pn d f(a) j d f(ξ) n+1 (i) ∃ξ ∈ [a, b]: f(a + h) = j=0 j! · h + (n+1)! · h . j Pn d f(a) j 1 1 n n+1 n+1 (ii) f(a + h) = j=0 j! · h + n! · 0 (1 − t) d f(a + th) · h dt. ρ ´ Moreover, there exists some R ∈ Re>0 such that n X djf(a) dn+1f(ξ) ∀k ∈ B (0) ∃ξ ∈ [a, a + k]: f(a + k) = · kj + · kn+1 (2.3) R j! (n + 1)! j=0 dn+1f(ξ) 1 1 · kn+1 = · (1 − t)n dn+1f(a + tk) · kn+1 dt ≈ 0. (2.4) (n + 1)! n! ˆ0 Formulas (i) and (ii) correspond to a plain generalization of Taylor’s theorem for ordinary functions with Lagrange and integral remainder, respectively. Dealing with generalized functions, it is important to note that this direct statement also includes the possibility that the differential dn+1f(ξ) may be infinite at some point. For this reason, in (2.3) and (2.4), considering a sufficiently small increment k, we get more classical infinitesimal remainders dn+1f(ξ) · kn+1 ≈ 0. We can 0 0 0 also define right and left derivatives as e.g. f (a) := f+(a) := limt→a f (t), which always exist if a ρ ρ ∞ ρ Lemma 18. Let X ⊆ Re and let f ∈ GC (X, Re). 0 (i) If x0 ∈ X is a sharply interior local minimum of f then f (x0) = 0. ρ (ii) Let a, b ∈ Re with a < b, [a, b] ⊆ X and x0 be a sharply interior point of [a, b]. Assume that 00 0 00 x0 is a local minimum of f. Then f (x0) ≥ 0. Vice versa, if f (x0) = 0 and f (x0) > 0, then x0 is a local minimum of f. 2.3. Embedding of Sobolev-Schwartz distributions and Colombeau generalized func- tions. We finally recall two results that give a certain flexibility in constructing embeddings of Schwartz distributions. Note that both the infinitesimal ρ and the embedding of Schwartz dis- tributions have to be chosen depending on the problem we aim to solve. A trivial example in this direction is the ODE y0 = y/dε, which cannot be solved for ρ = (ε), but it has a solution for ρ = (e−1/ε). As another simple example, if we need the property H(0) = 1/2, where H is the Heaviside function, then we have to choose the embedding of distributions accordingly. See also [24, 46] for further details. n n n If ϕ ∈ D(R ), r ∈ R>0 and x ∈ R , we use the notations r ϕ for the function x ∈ R 7→ 1 x n rn · ϕ r ∈ R and x ⊕ ϕ for the function y ∈ R 7→ ϕ(y − x) ∈ R. These notations permit us n to highlight that is a free action of the multiplicative group (R>0, ·, 1) on D(R ) and ⊕ is a n free action of the additive group (R>0, +, 0) on D(R ). We also have the distributive property r (x ⊕ ϕ) = rx ⊕ r ϕ. ρ Lemma 19. Let b ∈ Re be a net such that limε→0+ bε = +∞. Let d ∈ (0, 1). There exists a net n (ψε)ε∈I of D(R ) with the properties: CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 9 (i) supp(ψε) ⊆ B1(0) and ψε is even for all ε ∈ I. n−1 2n (ii) Let ωn denote the surface area of S and set cn := for n > 1 and c1 := 1, then ωn ψε(0) = cn for all ε ∈ I. (iii) ψε = 1 for all ε ∈ I. n α p + (iv) ´∀α ∈ ∃p ∈ : sup n |∂ ψ (x)| = O(b ) as ε → 0 . N N x∈R ε ε 0 α (v) ∀j ∈ N ∀ ε : 1 ≤ |α| ≤ j ⇒ x · ψε(x)dx = 0. 0 (vi) ∀η ∈ R>0 ∀ ε : |ψε| ≤ 1 +´η. ´ 0 (vii) If n = 1, then the net (ψε)ε∈I can be chosen so that −∞ ψε = d. b −1 ´ In particular ψε := bε ψε satisfies (iii)- (vi). Concerning embeddings of Schwartz distributions, we have the following result, where c(Ω) := 0 n {[xε] ∈ [Ω] | ∃K b Ω ∀ ε : xε ∈ K} is called the set of compactly supported points in Ω ⊆ R . n b Theorem 20. Under the assumptions of Lemma 19, let Ω ⊆ R be an open set and let (ψε) be the net defined in 19. Then the mapping b 0 b ρ ∞ ρ ιΩ : T ∈ E (Ω) 7→ T ∗ ψε (−) ∈ GC (c(Ω), Re) uniquely extends to a sheaf morphism of real vector spaces b 0 ρ ∞ ρ ι : D −→ GC (c((−)), Re), and satisfies the following properties: ∞ ρ ∞ ρ −a b ∞ (i) If b ≥ dρ for some a ∈ R>0, then ι |C (−) : C (−) −→ GC (c((−)), Re) is a sheaf b ∞ morphism of algebras and ιΩ(f)(x) = f(x) for all smooth functions f ∈ C (Ω) and all x ∈ Ω; 0 b (ii) If T ∈ E (Ω) then supp(T ) = supp(ιΩ(T )); b 0 (iii) limε→0+ Ω ιΩ(T )ε · ϕ = hT, ϕi for all ϕ ∈ D(Ω) and all T ∈ D (Ω); b α b b α 0 (iv) ι commutes´ with partial derivatives, i.e. ∂ ιΩ(T ) = ιΩ (∂ T ) for each T ∈ D (Ω) and α ∈ N. Concerning the embedding of Colombeau generalized functions, we recall that the special s s Colombeau algebra on Ω is defined as the quotient G (Ω) := EM (Ω)/N (Ω) of moderate nets over negligible nets, where the former is ∞ I n α −N EM (Ω) := {(uε) ∈ C (Ω) | ∀K b Ω ∀α ∈ N ∃N ∈ N : sup |∂ uε(x)| = O(ε )} x∈K and the latter is s ∞ I n α m N (Ω) := {(uε) ∈ C (Ω) | ∀K b Ω ∀α ∈ N ∀m ∈ N : sup |∂ uε(x)| = O(ε )}. x∈K Using ρ = (ε), we have the following compatibility result: s d s d Theorem 21. A Colombeau generalized function u = (uε) + N (Ω) ∈ G (Ω) defines a GSF u : d s d ρ ∞ ρ d [xε] ∈ c(Ω) −→ [uε(xε)] ∈ Re . This assignment provides a bijection of G (Ω) onto GC (c(Ω), Re ) for every open set Ω ⊆ Rn. Example 22. ρ ∞ ρ ρ (i) Let δ, H ∈ GC ( Re, Re) be the corresponding ιb-embeddings of the Dirac delta and of the Heaviside function. Then δ(x) = b · ψ(b · x), where ψ(x) := [ψε(xε)]. We have that δ(0) = b ≥ dρ−a and δ(x) = 0 if |bx| ≥ 1 because of Lem. 19.(i). The condition |bx| ≥ 1 −1 −a surely holds e.g. if |x| > k log dρ ≈ 0 for some k ∈ R>0 because b ≥ dρ ; in particular, it is satisfied if |x| > r for some r ∈ R>0. By the intermediate value theorem (see [23]), ρ δ takes any value in the interval [0, b] ⊆ Re. Similar properties can be stated e.g. for δ2(x) = b2 · ψ(b · x)2. (ii) Analogously, we have H(x) = 1 if |bx| ≥ 1 and x > 0; H(x) = 0 if |bx| ≥ 1 and x < 0; finally 1 H(0) = 2 because of Lem. 19.(i). By the intermediate value theorem, H takes any value in ρ the interval [0, 1] ⊆ Re. 10 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO Figure 2.1. Representations of Dirac delta and Heaviside function ρ ∞ ρ ρ 2 (iii) The composition δ ◦ δ ∈ GC ( Re, Re) is given by (δ ◦ δ)(x) = bψ b ψ(bx) and is an even function. If |bx| ≥ 1, then (δ ◦ δ)(x) = b. Since (δ ◦ δ)(0) = 0, again using the intermediate ρ value theorem, we have that δ ◦ δ takes any value in the interval [0, b] ⊆ Re. Suitably 1 1 choosing the net (ψε) it is possible to have that if − 2b ≤ x ≤ 2b (hence x is infinitesimal), k then (δ ◦ δ)(x) = 0. If x = b for some k ∈ N>0, then x is still infinitesimal but (δ ◦ δ)(x) = b. Analogously, one can deal with compositions such as H ◦ δ and δ ◦ H. See Fig. 2.1 for a graphical representations of δ and H. The infinitesimal oscillations shown in this figure occur only in an infinitesimal neighborhood of the origin because of (i) in example 22 −1 and because k log dρ ≈ 0. They can be proved to actually occur as a consequence of Lem. 19.(v) which is a necessary property to prove Thm. 20.(i), see [23, 24]. It is well-known that the latter property is one of the core ideas to bypass the Schwartz’s impossibility theorem, see e.g. [29]. 2.4. Functionally compact sets and multidimensional integration. 2.4.1. Extreme value theorem and functionally compact sets. For GSF, suitable generalizations of many classical theorems of differential and integral calculus hold: intermediate value theorem, mean value theorems, suitable sheaf properties, local and global inverse function theorems, Banach fixed point theorem and a corresponding Picard-Lindel¨ı¿œf theorem both for ODE and PDE, see [20, 22–24, 46]. Even though the intervals [a, b] ⊆ Re, a, b ∈ R, are not compact in the sharp topology (see [22]), analogously to the case of smooth functions, a ρGCk map satisfies an extreme value theorem on such sets. In fact, we have: ρ k Theorem 23. Let f ∈ GC (X, Re) be a generalized Ck function defined on the subset X of Ren. Let ∅= 6 K = [Kε] ⊆ X be an internal set generated by a sharply bounded net (Kε) of compact sets n Kε b R , then ∃m, M ∈ K ∀x ∈ K : f(m) ≤ f(x) ≤ f(M). (2.5) CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 11 We shall use the assumptions on K and (Kε) given in this theorem to introduce a notion of “compact subset” which behaves better than the usual classical notion of compactness in the sharp topology. n n Definition 24. A subset K of Re is called functionally compact, denoted by K bf Re , if there exists a net (Kε) such that n (i) K = [Kε] ⊆ Re . ρ (ii) ∃R ∈ Re>0 : K ⊆ BR(0), i.e. K is sharply bounded. n (iii) ∀ε ∈ I : Kε b R . n If, in addition, K ⊆ U ⊆ Re then we write K bf U. Finally, we write [Kε] bf U if (ii), (iii) and [Kε] ⊆ U hold. Any net (Kε) such that [Kε] = K is called a representative of K. We motivate the name functionally compact subset by noting that on this type of subsets, GSF have properties very similar to those that ordinary smooth functions have on standard compact sets. Remark 25. (i) By Thm.7.(iii), any internal set K = [Kε] is closed in the sharp topology. In particular, the open interval (0, 1) ⊆ Re is not functionally compact since it is not closed. n (ii) If H b R is a non-empty ordinary compact set, then the internal set [H] is functionally compact. In particular, [0, 1] = [[0, 1]R] is functionally compact. (iii) The empty set ∅ = e∅ bf Re. (iv) Ren is not functionally compact since it is not sharply bounded. (v) The set of compactly supported points c(R) is not functionally compact because the GSF f(x) = x does not satisfy the conclusion (2.5) of Thm. 23. In the present paper, we need the following properties of functionally compact sets. n ρ k d n d Theorem 26. Let K ⊆ X ⊆ Re , f ∈ GC (X, Re ). Then K bf Re implies f(K) bf Re . As a corollary of this theorem and Rem. (25).(ii) we get Corollary 27. If a, b ∈ Re and a ≤ b, then [a, b] bf Re. Let us note that a, b ∈ Re can also be infinite numbers, e.g. a = dρ−N , b = dρ−M or a = −dρ−N , b = dρ−M with M > N, so that e.g. [−dρ−N , dρM ] ⊇ R. Finally, in the following result we consider the product of functionally compact sets: n d n+d Theorem 28. Let K bf Re and H bf Re , then K × H bf Re . In particular, if ai ≤ bi for Qn n i = 1, . . . , n, then i=1[ai, bi] bf Re . A theory of compactly supported GSF has been developed in [20], and it closely resembles the classical theory of LF-spaces of compactly supported smooth functions. It establishes that for suit- able functionally compact subsets, the corresponding space of compactly supported GSF contains extensions of all Colombeau generalized functions, and hence also of all Schwartz distributions. As in the classical case (see e.g. [19]), thanks to the extreme value theorem 23 and the property ρ k ρ of functionally compact sets K, we can naturally define a topology on the space GC (K, Red): ◦ ρ n Definition 29. Let K bf Re be a functionally compact set such that K = K (so that partial derivatives at sharply boundary points can be defined as limits of partial derivatives at sharply ρ k ρ d interior points; such K are called solid sets). Let l ∈ N≤k and v ∈ GC (K, Re ). Then α i α i ρ kvkl := max max ∂ v (Mni) , ∂ v (mni) ∈ Re, |α|≤l 1≤i≤d where Mni, mni ∈ K satisfy α i α i α i ∀x ∈ K : ∂ v (mni) ≤ ∂ v (x) ≤ ∂ v (Mni). 12 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO The following result (see [41]) permits us to calculate the (generalized) norm kvkl using any net (vε) that defines v. ρ n Lemma 30. Under the assumptions of Def. 29, let [Kε] = K bf Re be any representative of K. Then we have: α i ρ (i) If the net (vε) defines v, then kvkl = max |α|≤l maxx∈Kε ∂ vε(x) ∈ Re; 1≤i≤d (ii) kvkl ≥ 0; (iii) kvkl = 0 if and only if v = 0; ρ (iv) ∀c ∈ Re : kc · vkl = |c| · kvkl; ρ k ρ d (v) For all u ∈ GC (K, Re ), we have ku + vkl ≤ kukl + kvkl and ku · vkl ≤ cl · kukl · kvkl for ρ some cl ∈ Re>0. ρ ρ k ρ Using these Re-valued norms, we can naturally define a topology on the space GC (K, Red). ρ n ρ k ρ d ρ Definition 31. Let K bf Re be a solid set. Let l ∈ N≤k, u ∈ GC (K, Re ), r ∈ Re>0, then l n ρ k ρ d o (i) Br(u) := v ∈ GC (K, Re ) | kv − ukl < r ρ k ρ (ii) If U ⊆ GC (K, Red), then we say that U is a sharply open set if ρ l ∀u ∈ U ∃l ∈ N≤k ∃r ∈ Re>0 : Br(u) ⊆ U. One can easily prove that sharply open sets form a sequentially Cauchy complete topology on ρ k ρ d ρ k ρ d GC (K, Re ), see e.g. [21, 47]. The structure GC (K, Re ), (k−kl)l≤k has the usual properties ρ of a graded Fr¨ı¿œchet space if we replace everywhere the field R with the ring Re, and for this ρ reason it is called an Re-graded Fr¨ı¿œchet space. 2.4.2. Multidimensional integration. Finally, to deal with higher-order calculus of variations, we ρ have to introduce multidimensional integration of GSF on suitable subsets of Ren (see [23]). ρ Definition 32. Let µ be a measure on Rn and let K be a functionally compact subset of Ren. Then, we call K µ-measurable if the limit E µ(K) := lim [µ(B ρm (Kε))] (2.6) m→∞ ε exists for some representative (Kε) of K. Here m ∈ N, the limit is taken in the sharp topology on ρ ρ n since [µ(BE m (K ))] ∈ , and BE (A) := {x ∈ : d(x, A) ≤ r}. Re ρε ε Re r R In the following result, we show that this definition generates a correct notion of multidimensional integration for GSF. ρ Theorem 33. Let K ⊆ Ren be µ-measurable. (i) The definition of µ(K) is independent of the representative (Kε). (ii) There exists a representative (Kε) of K such that µ(K) = [µ(Kε)]. ρ ∞ ρ (iii) Let (Kε) be any representative of K and let f ∈ GC (K, Re) be a GSF defined by the net (fε). Then ρ f dµ := lim fε dµ ∈ e m→∞ R ˆK ˆBE m (K ) ρε ε exists and its value is independent of the representative (Kε). (iv) There exists a representative (Kε) of K such that ρ f dµ = fε dµ ∈ Re (2.7) ˆK ˆKε ρ ∞ ρ for each f ∈ GC (K, Re). CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 13 Qn n (v) If K = i=1[ai, bi], then K is λ-measurable (λ being the Lebesgue measure on R ) and for ρ ∞ ρ all f ∈ GC (K, Re) we have " b1,ε bn,ε # ρ f dλ = dx1 ... fε(x1, . . . , xn) dxn ∈ Re ˆK ˆa1,ε ˆan,ε for any representatives (ai,ε), (bi,ε) of ai and bi respectively. Therefore, if n = 1, this notion of integral coincides with that of Thm. 13 and Def. 14. ρ ρ ∞ ρ (vi) Let K ⊆ Ren be λ-measurable, where λ is the Lebesgue measure, and let ϕ ∈ GC (K, Red) ρ ∞ ρ be such that ϕ−1 ∈ GC (ϕ(K), Ren). Then ϕ(K) is λ-measurable and f dλ = (f ◦ ϕ) |det(dϕ)| dλ ˆϕ(K) ˆK ρ ∞ ρ for each f ∈ GC (ϕ(K), Re). 3. Higher-order calculus of variations for generalized functions In this section we prove for GSF the higher-order Euler-Lagrange equation, the du Bois- Reymond optimality condition and the Noether’s theorem. We start with some definition of basic notions and notations. Using the sharp topology of Def. 31, we can define when a curve is a minimizer of a given functional. Note explicitly that there are no ρ restrictions on the generalized numbers a, b ∈ Re, a < b, e.g. they can also both be infinite. ρ Definition 34. Let t1, t2 ∈ Re, with t1 < t2, and let m ∈ N>0, then (i) For all q := q0 , . . . , qm−1, q := q0 , . . . , qm−1 ∈ ρ d×m, we set t1 t1 t1 t2 t2 t2 Re n o ρGC∞ (q , q ; m) := q ∈ ρGC∞([t , t ], ρ d) | q(i)(t ) = qi ∀i = 0, . . . , m − 1, j = 1, 2 . bd t1 t2 1 2 Re j tj ρ ∞ ρ ∞ ρ ∞ ρ ∞ We simply set GC0 (m) := GCbd(0, 0; m) and GC0 := GCbd(0, 0; 1). The subscript “bd” stands here for “boundary values”. ρ ρ ∞ ρ d ρ ∞ ρ d×(m+1) ρ (ii) Let t1, t2 ∈ Re with t1 < t2. Let q ∈ GC ([t1, t2], Re ) and L ∈ GC ([t1, t2]× Re , Re) and define m (1) (m) [q] (t) := t, q(t), q (t), . . . , q (t) ∀t ∈ [t1, t2] (3.1) m (1) (m) L[q] (t) :=L t, q(t), q (t), . . . , q (t) ∀t ∈ [t1, t2] (3.2) t2 m ρ J[q] := L[q] (t) dt ∈ Re. (3.3) ˆt1 Note explicitly that Thm. (iv).9 (closure of GSF with respect to composition) and Def. 14 of 1-dimensional integral of GSF (i.e. Thm. 13) allows us to say that J(q) is a well-defined ρ number in Re. ρ ∞ ρ ∞ (iii) We say that q is a local minimizer of J in GCbd(qt1 , qt2 ; m) if q ∈ GCbd(qt1 , qt2 ; m) and ρ l ρ ∞ ∃r ∈ Re>0 ∃l ∈ N ∀p ∈ Br(q) ∩ GCbd(qt1 , qt2 ; m): J(p) ≥ I(q) (3.4) ρ ∞ ρ (iv) Let q ∈ GC [t1, t2], Re . We define the first and second variation of J in direction ρ ∞ h ∈ GC0 (m) at q as 2 d 2 d δJ(q; h) := J(q + sh) δ J(q; h) := 2 J(q + sh) . ds s=0 d s s=0 ρ ∞ The GSF q is called weak extremal of J if δJ(q; h) = 0 for all h ∈ GC0 (m). 14 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO ρ ∞ ρ ρ ∞ ρ d×(m+1) ρ (v) More generally, if q ∈ GC [t1, t2], Re and Q ∈ GC ([t1, t2]× Re , Re), we say that ρ ∞ J satisfies at q the D’Alembert’s principle with generalized forces Q if for all h ∈ GC0 (m): t2 t s1 sm−1 ! (m) m δJ(q; h) = h (t) · ... Q[q] (sm) dsm ... ds2 ds1 dt. ˆt1 ˆt1 ˆt1 ˆt1 | {z } m times The following results establish classical necessary and sufficient conditions to decide if a function u is a local minimizer for the functional (3.3). ρ ρ ∞ ρ d×(m+1) ρ ρ d Theorem 35. Let t1, t2 ∈ Re with t1 < t2, let L ∈ GC ([t1, t2]× Re , Re), let qt1 , qt2 ∈ Re ρ ∞ and let q be a local minimizer of J in GCbd(qt1 , qt2 ; m). Then ρ ∞ (i) δJ(q; h) = 0 for all h ∈ GC0 (m); 2 ρ ∞ (ii) δ J(q; h) ≥ 0 for all h ∈ GC0 (m). ρ ρ ∞ ρ Proof. Let r ∈ Re>0 be such that (3.4) holds. Since h ∈ GC0 (m), the map s ∈ Re 7→ q + sh ∈ ρ ∞ GCbd(qt1 , qt2 ; m) is well defined and continuous with respect to the trace of the sharp topology in ρ l ρ ∞ its codomain. Therefore, we can findr ¯ ∈ Re>0 such that q + sh ∈ Br(q) ∩ GCbd(qt1 , qt2 ; m) for all ρ s ∈ Br¯(0). We hence have J(q +sh) ≥ J(q). This shows that the GSF s ∈ Br¯(0) 7→ J(q +sh) ∈ Re has a local minimum at s = 0. Now, we apply Lem. 18 and thus the claims are proven. ρ ρ d ρ ∞ Theorem 36. Let t1, t2 ∈ Re with t1 < t2 and qt1 , qt2 ∈ Re . Let q ∈ GCbd(qt1 , qt2 ; m) be such that ρ ∞ (i) δJ(q; h) = 0 for all h ∈ GC0 (m). 2 ρ ∞ l ρ ∞ ρ (ii) δ J(v; h) ≥ 0 for all h ∈ GC0 (m) and for all v ∈ Br(q) ∩ GCbd(qt1 , qt2 ; m), where r ∈ Re>0 and l ∈ N. ρ ∞ 2 Then q is a local minimizer of the functional J in GCbd(qt1 , qt2 ; m). Moreover, if δ J(v; h) > 0 for ρ ∞ l ρ ∞ all h ∈ GC0 (m) such that khkl > 0 and for all v ∈ B2r(u) ∩ GCbd(qt1 , qt2 ; m), then J(v) > J(q) l ρ ∞ for all v ∈ Br(q) ∩ GCbd(qt1 , qt2 ; m) such that kv − qkl > 0. l ρ ∞ ρ Proof. For any v ∈ Br(q) ∩ GCbd(qt1 , qt2 ; m), we set ψ(s) := J(q +s(v −q)) ∈ Re for all s ∈ B1(0), l ρ ∞ so that q + s(v − q) ∈ Br(q). Since (v − q)(t1) = 0 = (v − q)(t2), we have v − q ∈ GC0 (m), and properties (i), (ii) yield ψ0(0) = δJ(q; v − q) = 0 and ψ00(s) = δ2J(q + s(v − q); v − q) ≥ 0 for all s ∈ B1(0). We claim that s = 0 is a minimum of ψ. In fact, for all s ∈ B1(0) by Taylor’s Thm. 17 s2 ψ(s) = ψ(0) + sψ0(0) + ψ00(ξ) 2 0 s2 00 for some ξ ∈ [0, s]. But ψ (0) = 0 and hence ψ(s) − ψ(0) = 2 ψ (ξ) ≥ 0. Finally, since by Thm.7.(iii) the interval [0 , +∞) is closed in the sharp topology lim ψ(s) = J(v) ≥ ψ(0) = J(u), s→1− 2 ρ ∞ which is our conclusion. Note explicitly that if δ J(v; h) = 0 for all h ∈ GC0 (m) and for all l ρ ∞ 00 v ∈ Br(u) ∩ GCbd(qt1 , qt2 ; m), then ψ (ξ) = 0 and hence J(v) = J(q). 2 ρ ∞ Now, assume that δ J(v; h) > 0 for all h ∈ GC0 (m) such that khkl > 0 and for all v ∈ l ρ ∞ l ρ ∞ B2r(u) ∩ GCbd(qt1 , qt2 ; m), and take v ∈ Br(u) ∩ GCbd(qt1 , qt2 ; m) such that kv − qkl > 0. As ρ l above set ψ(s) := J(q + s(v − q)) ∈ Re for all s ∈ B3/2(0), so that q + s(v − q) ∈ B2r(q). We 0 00 2 have ψ (0) = 0 and ψ (s) = δ J(q + s(v − q); v − q) > 0 for all s ∈ B3/2(0) because kv − ukl > 0. 1 00 Using Taylor’s theorem, we get ψ(1) = ψ(0) + 2 ψ (ξ) for some ξ ∈ [0, 1]. Therefore ψ(1) − ψ(0) = 1 00 J(v) − J(q) = 2 ψ (ξ) > 0. In the following, we always assume the notations of Def. 34. Moreover, we also set ρ n ρ o Red[t] := p ∈ Re[t] | deg(p) ≤ d ρ for the set of all the polynomials with coefficients in the ring Re having degree less of equal to d ∈ N. CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 15 In order to prove the higher-order version of the fundamental lemma, we need the following ρ ρ ∞ ρ b Lemma 37. Let a, b ∈ Re be such that a < b, and let f ∈ GC [a, b], Re≥0 . If a f(t) dt = 0, then f = 0. ´ Proof. By contradiction, assume that f(x) 6= 0 for some x ∈ [a, b]. Let (fε) be a net of smooth functions that define the GSF f and let [xε] = x be a representative of x. The property 0 q ∀q ∈ N ∀ ε : fε(xε) ≤ ρε (3.5) implies f(x) ≤ dρq and hence also, letting q → +∞, f(x) ≤ 0. Since f(x) ≥ 0, this would imply f(x) = 0. Therefore, taking the negation of (3.5) we get q ∃q ∈ N ∃L ⊆ I : 0 ∈ L, ∀ε ∈ L : fε(xε) > ρε, ¯ ¯ 1 q where L is the closure of L in I. Set fε := fε for ε ∈ L and fε := 2 (fε + ρε) otherwise. Directly ¯ ¯ ρ ∞ ρ ¯ q+1 from Def.8, it follows that f := fε(−) ∈ GC ([a, b], Re≥0) and f(x) > dρ . The sharp continuity of f¯ at x (Thm.9.(ii)) implies ¯ q+1 ∃c, d : c < d, [c, d] ⊆ [a, b], f|[c,d] > dρ . (3.6) Moreover, from f ≥ 0 and Thm. 15 we obtain b c d b d 0 = f ≥ f + f + f ≥ f. ˆa ˆa ˆc ˆd ˆc dε Thereby, we can find a negligible net [zε] = 0 such that zε ≥ fε for all ε sufficiently small, cε where [cε] = c,[dε] = d with cε < dε because c < d (see Lem.´ 3). In particular, for ε ∈ L ¯ dε dε ¯ sufficiently small fε = fε and hence fε = fε ≤ zε. But then (3.6) and Thm. 15.(vii) yield cε cε ´ ´ dε dε 0 ¯ q+1 q+1 ∀ ε ∈ L : zε ≥ fε ≥ ρε = ρε (dε − cε) . ˆcε ˆcε h i Therefore, ρq+1 ≤ zε . However, zε = [zε] = 0 and hence ρq+1 ≤ ρq+2 by (2.1), a ε dε−cε dε−cε d−c ε ε contradiction. 3.1. Higher-order Euler–Lagrange equation, D’Alembert principle and du Bois–Rey- mond optimality condition. In this section, we first compute the first variation δJ(q; h) and thereby we deduce the Euler-Lagrange equations both in integral and differential forms, and the D’Alembert principle in differential form. ρ ∞ ρ ρ ∞ Lemma 38. If q ∈ GC [t1, t2], Re and h ∈ GC0 (m), then " m !# t2 X t s1 sm−i−1 δJ(q; h) = h(m)(t)· (−1)i ... ∂ L[q]m(s ) ds ... ds ds dt. ˆ ˆ ˆ ˆ i+2 m−i m−i 2 1 t1 i=0 t1 t1 t1 | {z } m−i times ρ ∞ Proof. According to Def. 34, and to Thm. 15 for any h ∈ GC0 (m) we have m ! t2 X δJ(q; h) = ∂ L[q]m(t) · h(i)(t) dt. (3.7) ˆ i+2 t1 i=0 16 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO By repeated integration by parts (Thm. 15.(vi)) one has m X t2 ∂ L[q]m(t) · h(i)(t) dt ˆ i+2 i=0 t1 m ("m−i !#t2 X X t s1 sj−1 = (−1)j+1h(i+j−1)(t) · ... ∂ L[q]m(s ) ds ... ds ds ˆ ˆ ˆ i+2 j j 2 1 i=0 j=1 t1 t1 t1 | {z } t1 j times t2 t s1 sm−i−1 ! ) i (m) m + (−1) h (t) · ... ∂i+2L[q] (sm−i) dsm−i ... ds2 ds1 dt (3.8) ˆt1 ˆt1 ˆt1 ˆt1 | {z } m−i times (i) (i) Since h (t1) = 0 = h (t2), i = 0, . . . , m − 1, the first summands in (3.8) vanish. Therefore, equations (3.7) and (3.8) yield the conclusion. In order to prove the Euler-Lagrange equation in integral form (see below Cor. 41), we first need to show the higher order du Bois–Reymond lemma of the calculus of variations for generalized functions. ρ Lemma 39 (Fundamental higher order du Bois–Reymond lemma). Let a, b ∈ Re be such that ρ ∞ ρ a < b, and let f ∈ GC [a, b], Re . If b (m) ρ ∞ f(t)h (t) dt = 0 for all h ∈ GC0 (m) (3.9) ˆa ρ Then f(t) ∈ Rem−1[t]. Proof. We generalize the proof of [63, Lem. 4.1]. Considering the change of variable t 7→ f(t + a), without loss of generality we may assume a = 0. Note that the last apparently trivial step is correct because our generalized functions are ordinary set-theoretical functions (and hence their pointwise evaluation is obvious; it is well-known, see e.g. [45], that this is not the case for Sobolev- Schwartz distributions), and because GSF are close with respect to composition, Thm.9 (e.g. this step would be possible for Colombeau generalized functions only if a is finite, see e.g. [29]). The composition of primitives t x1 x2 xm−1 H(t) := ... f(x) dx1 ... dxn−1 dx ∀t ∈ [0, b] (3.10) ˆ0 ˆ0 ˆ0 ˆ0 ρ ∞ ρ is a GSF H ∈ GC ([0, b], Re) because of Def. 14 and of Thm.9. By repeated application of Def. 14 we have that H(m)(t) = f(t) for all t ∈ [0, b] and H(k)(0) = 0 for k = 0, . . . , m − 1. We note that b > 0, so that b is invertible, and we can recursively define −m z0 := b H(b) k −m (k+1) X −j−1 zk+1 := b H (b) − (k + 1) m(m − 1) · ... · (m − j)b · zk−j ∀k = 0, . . . , m − 2. j=0 (3.11) Pm−1 zk k ρ m ρ (m) Finally, set z(t) := k=0 k! (t − b) ∈ Rem−1[t], P (t) := t z(t) ∈ Rem−1[t], p(t) := P (t) ∈ ρ (k) (k) (k) Rem−1[t] and h(t) := H(t) − P (t). In this way, we have h (0) = H (0) − P (0) = 0, hence ρ ∞ h ∈ GC0 (m), and, from (3.11): k−1 (k) (k) m X m−j−1 h (b) = H (b) − b zk − k m(m − 1) · ... · (m − j)b · zk−1−j = 0 ∀k = 0, . . . , m − 1. j=0 CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 17 By these boundary equalities of h at 0 and b, and by repeated integration by parts, we get b b b p(t)h(m)(t) dt = − p0(t)h(m−1)(t) dt = ··· = p(m)(t)h(t) dt = 0 (3.12) ˆ0 ˆ0 ˆ0 ρ (m) 2 (m) because p ∈ Rem−1[t]. Since h (t) = f(t) − p(t), we have (f(t) − p(t)) = (f(t) − p(t))h (t) and hence (3.12), Thm. 15.(vii) and assumption (3.9) yield b b b 0 ≤ (f(t) − p(t))2 dt = (f(t) − p(t))h(m)(t) dt = f(t)h(m)(t) dt = 0. ˆ0 ˆ0 ˆ0 Therefore, the conclusion f(t) = p(t) follows from Lem. 37. As a simple consequence, we have the following ρ ρ ∞ ρ Corollary 40. Let a, b ∈ Re be such that a < b, and let f, g ∈ GC ([a, b], Re). If b 0 ρ ∞ (f(t)h(t) + g(t)h (t)) dt = 0 for all h ∈ GC0 , ˆa then g0 = f. In particular, for g = 0, if b ρ ∞ f(t)h(t) dt = 0 for all h ∈ GC0 , ˆa then f = 0. x ρ ∞ ρ 0 Proof. Set F (x) := a f(t) dt for x ∈ [a, b], so that F ∈ GC ([a, b], Re) and F = f by Def. 14. Integrating by parts´ (Thm. 15.(vi)), we get b b (f(t)h(t) + g(t)h0(t)) dt = (g(t) − F (t)) h0(t) dt = 0 ˆa ˆa ρ ∞ ρ for all h ∈ GC0 . Therefore, Lem. 39 implies that g(t) − F (t) is a constant (in Re), and hence 0 0 g = F = f as claimed. Applying the higher-order du Bois–Reymond Lem. 39, we arrive at the first form of the Euler- Lagrange equations: ρ ∞ ρ Corollary 41. If q ∈ GC [t1, t2], Re is a weak extremal of functional (3.3), then q satisfies the following higher-order Euler–Lagrange integral equation: m t s1 sm−i−1 ! X i m ρ (−1) ... ∂i+2L[q] (t) dsm−i ... ds1 ∈ em−1[t]. (3.13) ˆ ˆ ˆ R i=0 t1 t1 t1 | {z } m−i times Taking the derivative of order m to (3.13), we obtain ρ ∞ ρ Corollary 42 (Higher-order Euler–Lagrange equations in differential form). If q ∈ GC [t1, t2], Re is a weak extremal of functional (3.3), then m X di (−1)i ∂ L[q]m(t) = 0 ∀t ∈ [t , t ]. (3.14) dti i+2 1 2 i=0 Finally, Lem. 38 and Def. 34.(v) yield the D’Alembert principle in differential form: ρ ∞ ρ ρ ∞ ρ d×(m+1) ρ Corollary 43. Let q ∈ GC [t1, t2], Re , Q ∈ GC ([t1, t2] × Re , Re) and assume that J satisfies at q the D’Alembert’s principle with generalized forces Q, then m X di (−1)i ∂ L[q]m(t) = Q[q]m(t) ∀t ∈ [t , t ]. dti i+2 1 2 i=0 18 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO Proof. In fact, we have t2 t s1 sm−1 ! (m) m δJ(q; h) − h (t) · ... Q[q] (sm) dsm ... ds2 ds1 dt = ˆt1 ˆt1 ˆt1 ˆt1 | {z } m times m ! t2 X t s1 sm−i−1 = h(m)(t) · (−1)i ... ∂ L[q]m(s ) ds ... ds ds − ˆ ˆ ˆ ˆ i+2 m−i m−i 2 1 t1 i=0 t1 t1 t1 | {z } m−i times t s1 sm−1 ! m − ... Q[q] (sm) dsm ... ds2 ds1 dt = 0. ˆt1 ˆt1 ˆt1 | {z } m times Thereby, Lem. 39 gives m ! X t s1 sm−i−1 (−1)i ... ∂ L[q]m(s ) ds ... ds ds − ˆ ˆ ˆ i+2 m−i m−i 2 1 i=0 t1 t1 t1 | {z } m−i times t s1 sm−1 ! m ρ − ... Q[q] (sm) dsm ... ds2 ds1 ∈ Rem−1[t]. ˆt1 ˆt1 ˆt1 | {z } m times Taking the derivative of order m yields the conclusion. ρ ∞ ρ Associated to a given function q ∈ GC [t1, t2], Re , it is convenient to introduce the following quantities (see [18]): m−j X di ϕj(q)(t) := (−1)i ∂ L[q]m(t) ∀j = 0, . . . , m ∀t ∈ [t , t ] (3.15) dti i+j+2 1 2 i=0 These operators are useful for our purposes because of the following properties: d ϕj(q) = ∂ L[q]m − ϕj−1(q) ∀j = 1, . . . , m. (3.16) dt j+1 We are now in conditions to prove a higher-order du Bois–Reymond optimality condition. ρ ∞ ρ d Theorem 44. If q ∈ GC [t1, t2], Re is a weak extremal of functional (3.3), then m d X L[q]m − ϕj(q) · q(j) = ∂ L[q]m. (3.17) dt 1 j=1 Proof. Using, for simplicity, ϕj := ϕj(q), we have m d X L[q]m − ϕj · q(j) = dt j=1 m m m X m (j+1) X j (j) j (j+1) = ∂1L[q] + ∂j+2L[q] · q − ϕ˙ · q + ϕ · q . (3.18) j=0 j=1 CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 19 Using (3.16), equation (3.18) becomes m m d X X L[q]m − ϕj · q(j) = ∂ L[q]m + ∂ L[q]m · q(j+1) dt 1 j+2 j=1 j=0 m X m j−1 (j) j (j+1) − ∂j+1L[q] − ϕ · q + ϕ · q . (3.19) j=1 We now simplify the second term on the right-hand side of (3.19), which forms a telescopic sum: m X m j−1 (j) j (j+1) ∂j+1L[q] − ϕ · q + ϕ · q = j=1 m−1 X m j (j+1) j+1 (j+2) = ∂j+2L[q] − ϕ · q + ϕ · q = j=0 m−1 X m (j+1) 0 m (m+1) = ∂j+2L[q] · q − ϕ · q˙ + ϕ · q . (3.20) j=0 Substituting (3.20) into (3.19) and using the higher-order Euler–Lagrange equations (3.14), and since, by definition, m m ϕ = ∂m+2L[q] and m X di ϕ0 = (−1)i ∂ L[q]m = 0 , dti i+2 i=0 we obtain the intended result, that is, m d X L[q]m − ϕj · q(j) = dt j=1 m m (m+1) = ∂1L[q] + ∂m+2L[q] · q 0 m (m+1) m + ϕ · q˙ − ϕ · q = ∂1L[q] . 3.2. Higher-order Noether’s theorem. In our approach to Noether’s theorem, we can use the ρ non-Archimedean language of Re to formalize some infinitesimal properties frequently informally used in this topic. ρ k ρ ∞ Definition 45. Let D ⊆ Re be a sharply open set. We say that Ψ = {ψ(s, ·)}s∈P ∈ GC (D,D) is a one parameter group of diffeomorphisms of D if it satisfies: (i) P is a sharply open set, 0 ∈ P and ψ ∈ ρGC∞(P × D,D); (ii) For each s ∈ P , the map ψ(s, ·) ∈ ρGC∞(D,D) is invertible, and ψ(s, ·)−1 ∈ ρGC∞(D,D); (iii) ψ(0, ·) = IdD; (iv) ∀s, s0 ∈ P : s + s0 ∈ P ⇒ ψ(s, ·) ◦ ψ(s0, ·) = ψ(s + s0, ·). ρ We will see on Sec. 5.5 the actual usefulness of the case D ⊂ Rek. Note that, because of properties (i) and (iii), from Taylor’s formula Thm. 17, we get ρ ∂ψ ∃S ∈ e>0 ∀s ∈ [0,S] ∩ P : ψ(s, q(t)) ≈ q(t) + s (0, q(t)). R ∂s Therefore, for a sufficiently small infinitesimal s, we can always say that ψ(s, t) is infinitely close to ∂ψ a transformation of the form q(t) 7→ q(t) + sη(q(t)), where η(q) := ∂s (0, q). To write an equality ∂ψ sign instead of the infinitely close sign ≈, we can use Thm. 11 and the notation ∂s [0, q(t); s] for 20 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO the generalized partial incremental ratio with respect to the variable s (we recall that it is another GSF): ρ ∂ψ ∃S ∈ e>0 ∀s ∈ [0,S] ∩ P : ψ(s, q(t)) = q(t) + s [0, q(t); s] R ∂s ∂ψ ∂ψ [0, q(t); s] ≈ (0, q(t)). ∂s ∂s ρ ∞ 0 0 ρ ∞ 0 0 Definition 46. Let both T = {τ(s, ·)}s∈P ∈ GC (T ,T ) and S = {σ(s, ·)}s∈P ∈ GC (S ,S ) 0 0 ρ d be one parameter groups of diffeomorphims on the open sets T ⊆ [t1, t2] and S ⊆ Re respectively. The functional (3.3) is said to be invariant under the action of T and S, if for any weak extremal ρ ∞ 0 q ∈ GC ([t1, t2],S ) it satisfies dσ(s, q(t)) dmσ(s, q(t)) ∂τ L[q]m(t) = L τ(s, t), σ(s, q(t)), ,..., (s, t) (3.21) dτ(s, t) dτ m(s, t) ∂t i 0 d σ(s,q(t)) for all s ∈ P and all t ∈ T , where the expressions dτ i(s,t) , i = 1, . . . , m are defined in the following Rem. 47.(ii). Remark 47. (i) From Def. 45.(ii) we have τ s, τ(s, ·)−1(t) = t, thereby the chain rule Thm. 12 yields that ∂τ 0 ∂t (s, t) is invertible for all s ∈ P , t ∈ T . diσ(s,q(t)) (ii) Based on the previous remark, in (3.21) the expressions dτ i(s,t) , i = 1, . . . , m are defined as dσ(s, q(t)) ∂σ(s,q(t)) (s, t) := ∂t , (3.22) dτ(s, t) ∂τ ∂t (s, t) and di−1σ(s,q(t)) i ∂ d σ(s, q(t)) ∂t dτ i−1(s,t) (s, t) := , ∀i = 2, . . . , m. (3.23) dτ i(s, t) ∂τ ∂t (s, t) i d σ(0,q(t)) (i) As usual, because of Def. 45.(iii), we have dτ i(0,t) = q (t). (iii) Using again the generalized incremental ratios, i.e Thm. 11, for all s ∈ P , all t ∈ T 0 and for ρ h ∈ Re sufficiently small, we have ∂σ(s, q(t)) σ(s, q(t + h)) = σ(s, q(t)) + h · [s, t; h] ∂t ∂τ τ(s, t + h) = τ(s, t) + h · [s, t; h]. ∂t Therefore, the ratio (of differentials): σ(s, q(t + h)) − σ(s, q(t)) ∂σ(s,q(t)) [s, t; h] ∂σ(s,q(t)) (s, t) = ∂t ≈ ∂t τ(s, t + h) − τ(s, t) ∂τ ∂τ ∂t [s, t; h] ∂t (s, t) for all sufficiently small invertible h. (iv) On the basis of Def. 14, condition (3.21) is equivalent to tb tb m m dσ(s, q(t)) d σ(s, q(t)) ∂τ L[q] (t) dt = L τ(s, t), σ(s, q(t)), ,..., m (s, t) dt ˆta ˆta dτ(s, t) dτ (s, t) ∂t 0 for all s ∈ P and for any subinterval [ta, tb] ⊆ T with ta < tb, even if L can also be a distribution. Clearly, if L is not a continuous function, this equivalence is due to the ρ ∞ ρ regularization process inherent to the embedding ιb : D0 −→ GC (c(−), Re) of Sec. 2.3. The next lemma gives a necessary condition for the functional (3.3) to be invariant under ρ ∞ 0 0 the action of the parameter groups of diffeomorphisms T = {τ(s, ·)}s∈P ∈ GC (T ,T ) and ρ ∞ 0 0 S = {σ(s, ·)}s∈P ∈ GC (S ,S ). CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 21 Lemma 48. If the functional (3.3) is invariant in the sense of Def. 46, then m ∂L[q]m ∂τ ∂τ X ∂L[q]m d ∂τ (0, t) + (0, t) · ηi(q, ·) + L[q]m (0, ·) = 0 ∀t ∈ T 0 (3.24) ∂t ∂t ∂t ∂q(i) dt ∂s i=0 ρ ∞ 0 for any weak extremal q ∈ GC ([t1, t2],S ). In (3.24), we set ( 0 ∂σ η (q, ·) := ∂s (0, q(·)), i−1 (3.25) i dη (q,·) (i) d ∂τ η (q, ·) := dt − q dt ∂s (0, ·), ∀i = 1, . . . , m. Proof. Differentiating (3.21) with respect to s at s = 0 ∈ P and t ∈ T 0 and using Rem 47.(ii), we obtain m ∂L[q]m ∂τ ∂τ X ∂L[q]m ∂ diσ(s, q(t)) ∂2τ (0, t) + (0, t) · + L[q]m(t) (0, t) = 0. (3.26) ∂t ∂t ∂t ∂q(i) ∂s dτ i(s, t) ∂s∂t i=0 s=0 Note explicitly that using two times Thm. 11, for any GSF f we always have 2 2 2 ∂ f ∂fε ∂fε (x0, y0) = (x0ε, y0ε) = (x0ε, y0ε) = ∂x∂y ∂xε∂yε ∂yε∂xε 2 ∂ f d ∂f d ∂f = (x0, y0) = (x, y0) = (x0, y0). ∂y∂x dx ∂y dx ∂y x0 Using Rem. 47.(ii) and (3.25), we have 2 2 ∂ dσ(s, q(t)) ∂ σ(s, q(t)) ∂ τ = (0, t) − q˙(t) (0, t) (3.27) ∂s dτ(s, t) s=0 ∂s∂t ∂s∂t 2 d ∂σ(s, q(t)) ∂ τ = (t) − q˙(t) (0, t) (3.28) dt ∂s s=0 ∂s∂t dη0(q, t) ∂2τ = − q˙(t) (0, t) = η1(q, t), (3.29) dt ∂s∂t and for all i = 2, . . . , m ! ! ∂ diσ(s, q(t)) ∂2 di−1σ(s, q(t)) ∂2τ = − q(i)(t) (0, t) ∂s dτ i(s, t) ∂s∂t dτ i−1(s, t) ∂s∂t s=0 s=0 ! ! d ∂ di−1σ(s, q(t)) ∂2τ = (t) − q(i)(t) (0, t) (3.30) dt ∂s dτ i−1(s, t) ∂s∂t s=0 dηi−1(q, t) ∂2τ = − q(i)(t) (0, t) = ηi(q, t). (3.31) dt ∂s∂t Substituting (3.30) into (3.26), we obtain the conclusion. 0 m ρ ∞ 0 ρ d Definition 49. Let T ⊆ [t1, t2] be a sharply open set. We say that a GSF C[q] (·) ∈ GC T , Re is a constant of motion on T 0, S0 if d C[q]m(t) = 0 ∀t ∈ T 0 (3.32) dt ρ ∞ 0 along all the weak extremals q ∈ GC ([t1, t2],S ). Note that Thm. 13 implies that condition (3.32) is equivalent to ask that C[q]m(·) is constant on any interval J ⊆ T 0. In fact, if t0 ∈ J, F (t) := C[q]m(t) − C[q]m(t0) is the unique GSF such that F (t0) = 0 and F 0(t) = 0 on any closed interval contained in T 0 (and hence on an arbitrary interval by sharp continuity). 22 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO Theorem 50. If functional (3.3) is invariant in the sense of Def. 46, then the quantity C[q]m(t) ρ ∞ 0 0 defined for all q ∈ GC ([t1, t2],S ) and t ∈ T by m m ! X X ∂τ C[q]m(t) := ϕi(q, t) · ηi−1(q, t) + L[q]m(t) − ϕi(q, t) · q(i)(t) (0, t) (3.33) ∂s i=1 i=1 is a constant of motion on T 0 and S0. Proof. The proof follows directly from the previous results and no specific property of GSF is needed. For simplicity, we use the notations ϕi := ϕi(q, ·) and ηi := ηi(q, ·). m m ! X X ∂τ C[q]m(·) = ϕ1 · η0 + ϕi · ρi−1 + L[q]m − ϕi · q(i) (0, ·). (3.34) ∂s i=2 i=1 Differentiating (3.34) with respect to t, we obtain m d d d X d d C[q]m(t) = η0 · ϕ1 + ϕ1 · η0 + ηi−1 · ϕi + ϕi · ρi−1 dt dt dt dt 1 dt i=2 m ! ∂τ d X + (0, ·) L[q]m − ϕi · q(i) ∂s dt i=1 m ! X d ∂τ + L[q]m − ϕi · q(i) (0, ·). (3.35) dt ∂s i=1 Using the higher order Euler–Lagrange equation (3.14), the higher order du Bois–Reymond con- dition (3.17), relations (3.15) and (3.25) in (3.35), one gets d ∂σ d ∂τ C[q]m(t) = ∂ L[q]m · (0, q(·)) + ϕ1 · η1 +q ˙ (0, ·) dt 2 ∂s dt ∂s m X d ∂τ + ∂ L[q]m − ϕi−1 · ηi−1 + ϕi · ηi + q(i) (0, ·) i+1 dt ∂s i=2 m ! ∂τ X d ∂τ + ∂ L[q]m (0, ·) + L[q]m − ϕi · q(j) (0, ·) 1 ∂s dt ∂s i=1 ∂τ d ∂τ ∂σ = ∂ L[q]m (0, ·) + L[q]m (0, ·) + ∂ L[q]m · (0, q(·)) 1 ∂s dt ∂s 2 ∂s d ∂τ + ϕ1 · η1 +q ˙ (0, ·) − ϕ1 · η1 dt ∂s m d ∂τ X − ϕ1 · q˙ (0, ·) + ϕm · ηm + ∂ L[q]m · ηi−1 = 0. (3.36) dt ∂s i+1 i=2 4. Optimal control problems Thm. 50 gives a Lagrangian formulation of Noether’s principle extended to the generalized smooth functions setting. In this section, we adopt the Hamiltonian formalism in order to gener- alize Noether’s principle to the wider context of optimal control, see e.g. [13]. Let us consider the optimal control problem for GSF in Lagrange form, i.e. the minimization of the functional: t2 L (t, q(t), u(t)) dt −→ min (4.1) ˆt1 subject to the Cauchy problem CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 23 ( q˙(t) = ϕ (t, q(t), u(t)) , (CP) q(t1) = q1 . We explicitly note here that, because our generalized functions are set-theoretical maps, and because of the closure with respect to composition, the natural notion of solution for a differential equation with GSF is given by pointwise equality. On the other hand, Thm. 13 and Cor. 40 imply that, for GSF, the notion of pointwise solution is equivalent to that of weak solution. ρ d ρ In problem (4.1)–(CP), q1 ∈ Re , the Lagrangian L :[t1, t2] × H × K −→ Re, the state equation ρ d ϕ :[t1, t2] × H × K −→ Re , the state q :[t1, t2] −→ H and the control u :[t1, t2] −→ K are ρ d ρ l assumed to be GSF with respect to all the arguments, where H bf Re , K bf Re are functionally compact sets. Note that a particular case of this assumption is e.g. H = [−dρ−q, dρ−q]d ⊇ Rd because dρ−q is an infinite number. Since Sobolev-Schwartz distributions can be embedded as GSF (see Sec. 2.3), this is clearly more general than the usual traditional case, where state and control functions are assumed to be piecewise smooth. In particular, we will always consider state and control functions in the following spaces ρ ∞ q ∈ Q := {q ∈ GC ([t1, t2],H) | kq − q1k0 ≤ r} ρ ∞ u ∈ A := GC ([t1, t2],K), ρ where r ∈ Re>0 is a fixed generalized radius such that Br(q1) ⊆ H. In this section, we use simplified notations, e.g., of the form L(·, q, q˙) or ϕ(·, q, u) to denote compositions t ∈ [t1, t2] 7→ ρ ρ d L(t, q(t), q˙(t)) ∈ Re and t ∈ [t1, t2] 7→ ϕ(t, q(t), u(t)) ∈ Re . In developing this topic, it is therefore essential to already have suitable results about solutions of ODE with GSF. The Banach fixed point theorem can be easily generalized to spaces of gener- alized continuous functions with the sup-norm k − k0 (see Def. 29). As a consequence, we have the following Picard-Lindel¨ı¿œftheorem for ODE in the ρGCk setting, see [14, 47]. ρ ρ d ρ ρ k ρ d Theorem 51. Let t0 ∈ Re, y0 ∈ Re , α, r ∈ Re>0. Let F ∈ GC ([t0 − α, t0 + α] × Br(y0), Re ). ρ Set M := max |F (t, y)|, L := max |∂yF (t, y)| ∈ Re and assume that t0−α≤t≤t0+α t0−α≤t≤t0+α |y−y0|≤r |y−y0|≤r α · M ≤ r, lim αnLn = 0, (4.2) n→+∞ where the limit in (4.2) is clearly taken in the sharp topology. Then there exists a unique solution ρ k+1 ρ d y ∈ GC [t0 − α, t0 + α], Re of the Cauchy problem ( y0(t) = F (t, y(t)) (4.3) y(t0) = y0. This solution is given by n y = lim P (y0) n→+∞ t P (y)(t) : = y0 + F (s, y(s)) ds ∀t ∈ [t0 − α, t0 + α], ˆt0 n P+∞ αnLn and for all n ∈ N satisfies ky − P (y0)k0 ≤ αM k=n n! and ky − y0k0 ≤ r. Finally, we have the following Gr¨ı¿œnwall-Bellman inequality in integral form: ρ ρ k ρ Theorem 52. Let α ∈ Re>0. Let u, a, b ∈ GC [0, α], Re and assume that kak0 · α < N · −1 log dρ for some N ∈ N. Assume that a(t) ≥ 0 for all t ∈ [0, α], and that u(t) ≤ b(t) + t 0 a(s)u(s) ds. Then ´ 24 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO (i) For every t ∈ [0, α] we have t t a(r) dr u(t) ≤ b(t) + a(s)b(s)e s ds; ´ ˆ0 (ii) If b(t) ≤ b(s) for all t ≤ s, i.e. if b is non-decreasing, then for every t ∈ [0, α] we have t a(s) ds u(t) ≤ b(t)e 0 ; ´ In this section, we always assume that the state equation ϕ satisfies the assumptions of Thm. 51 for each control u ∈ A, i.e. setting: ∀u ∈ A : Mu := max |ϕ(t, q, u(t))| t1≤t≤t2 |q−q1|≤r Lu := max |∂qϕ(t, q, u(t))| , (4.4) t1≤t≤t2 |q−q1|≤r we assume that ∀u ∈ A :(t2 − t1)Mu ≤ r n n lim (t2 − t1) Lu = 0. (4.5) n→+∞ Therefore, Thm. 51 allows us to state that u ∀u ∈ A ∃!q ∈ Q :(CP) holds for all t ∈ [t1, t2]. (4.6) Note explicitly, the importance in this deduction of the closure of GSF (of which, we recall, Sobolev-Schwartz distributions are a particular case) with respect to composition (Thm.9.(iv)). ρ Finally, observe that the constant Lu ∈ Re defined in (4.4) and Taylor Thm. 17 yield the Lipschitz condition ∀u ∈ A ∀t ∈ [t1, t2] ∀q, q˜ ∈ Br(q1): |ϕ(t, q, u(t)) − ϕ(t, q,˜ u(t))| ≤ Lu |q − q˜| . (4.7) 4.1. Weak Pontryagin Maximum Principle. Based on (4.6), we can introduce the notation and the optimization problem t2 ∀u ∈ A : I[u] := L (t, qu(t), u(t)) dt (4.8) ˆt1 ρ l Find v ∈ A :(CP) and ∃r ∈ Re>0 ∃l ∈ N ∀u ∈ A ∩ Br(v): I[v] ≤ I[u]. (4.9) In order to develop the necessary optimality condition for Problem (4.8), we first assume that ◦ ρ u ∈ A in the sharp topology of the Re-graded Fr¨ı¿œchet space (A, k − k0), see Sec. 2.4.1, i.e. ρ ∃r ∈ Re>0 : Br(u) = u¯ ∈ A | max |u¯(t) − u(t)| < r ⊆ A. t1≤t≤t2 Thereby, for all controlu ¯ ∈ A, and for some δ = δ(u, u¯) ∈ (0, 1) sufficiently small ∀h ∈ (−δ, δ): u + hu¯ ∈ A, (4.10) ρ and hence we can evaluate I[u + hu¯] ∈ Re. ρ To define the first variation of I[−]: A −→ Re at u ∈ A in the directionu ¯ ∈ A, we can intuitively think at a sort of first order Taylor sum of I[u + hu¯] at u: ρ ρ Definition 53. Let δ ∈ Re>0 be such that (4.10) holds. We say that R :(−δ, δ) −→ Re is an incremental ratio of I[−] at u ∈ A in the direction u¯ ∈ A, if there exists a function Rˆ :(−δ, δ) −→ ρ Re (called remainder) such that: (i) ∀h ∈ (−δ, δ): I[u + hu¯] = I[u] + h · R(h) + Rˆ(h) (ii) R is continuous at 0 in the sharp topology ρ ˆ 2 (iii) ∃A ∈ Re>0 ∀h ∈ (−δ, δ): R(h) ≤ A · h . ρ We first prove that R(0) ∈ Re is unique: CALCULUS OF VARIATIONS AND OPTIMAL CONTROL FOR GF 25 ◦ Theorem 54. Let u ∈ A and u¯ ∈ A. If R, S are incremental ratios of I[−] at u in the direction u¯, then R(0) = S(0). ρ ρ Proof. We have R :(−δ1, δ1) −→ Re and S :(−δ2, δ2) −→ Re from Def. 53. Set δ := min(δ1, δ2), so that δ > 0 by Lem.3. Let Rˆ and Sˆ be remainders of R, S respectively. Take an invertible h ∈ (−δ, δ); from Def. 53.(i), we get R(h)−S(h) = h−1· Rˆ(h) − Sˆ(h) . Therefore, |R(h) − S(h)| ≤ (A + Aˆ)h from Def. 53.(iii), and hence lim h→0 |R(h) − S(h)| = 0. But invertible elements h invertible are dense in the sharp topology (Lem.5), and thus lim h→0 |R(h) − S(h)| = 0 = R(0) − S(0) using Def. 53.(ii). ◦ Definition 55. We can therefore define the first variation of I[−] at u ∈ A in the direction u¯ ∈ A as δIu(u;u ¯) := R(0), where R is any incremental ratio of I[−]. Moreover, we say that u ∈ A is a weak extremal of I[−] if for anyu ¯ ∈ A, δI(u;u ¯) = 0. Using Def. 53 and similarly to Thm. 35, we can prove the following ◦ Theorem 56. If v ∈ A solves problem (4.9), then it is a weak extremal of I[−]. ρ l Proof. Letu ¯ ∈ A and let δ ∈ Re>0 be such that (4.10) holds and such that v + hu¯ ∈ Br(v) for all h ∈ (−δ, δ). From Def. 53.(i) of incremental ratio, we have I[v + hu¯] = I[v] + hR(h) + Rˆ(h) ≥ I[v]. Therefore, for all k ∈ (0, δ), we get kR(k) ≥ −Rˆ(k) and hence −R(k) ≤ Ak = A|k| by Def.53.(iii). Similarly, for all k ∈ (−δ, 0), we get R(k) ≤ −Ak = A|k|. Thereby, |R(k)| = max (R(k), −R(k)) ≤ A|k| for all invertible k ∈ (−δ, δ), which implies R(0) = δI(v;u ¯) = 0. ρ We will later prove that δI(u; ·) is an Re-linear continuous map. We first show the following continuity conditions for the map u ∈ A 7→ qu ∈ Q (see [9] for a similar proof): ◦ Theorem 57 (Stability of order 1 and 2). Let u ∈ A, u¯ ∈ A and δ ∈ (0, 1) as in (4.10). Assume that −1 ∀u, u¯ ∈ A ∃N ∈ N ∀h ∈ (−δ, δ): Lu+hu¯(t2 − t1) ≤ N log(dρ ). (4.11) ρ Then, there exist constants A, A¯ ∈ Re (depending only on u, u¯ and clearly on ϕ) such that for all | h |< δ, we have u+hu¯ u q − q 0 ≤ |h| A (4.12) u+hu¯ u ¯ 2 q − q − hq¯ 0 ≤ Ah , (4.13) ρ ∞ ρ d where q¯ ∈ GC ([t1, t2], Re ) is the unique global solution of the following Cauchy’s problem ∂ϕ ∂ϕ q¯˙ = (t, qu, u) · q¯ + (t, qu, u) · u¯ ∂q ∂u (LCP1) q¯(t1) = 0 Proof. We prove in detail (4.12); the proof of (4.13) is similar. Let h ∈ (−δ, δ). From the evolution ODE (CP), for all t ∈ [t1, t2], we have t u+hu¯ u u+hu¯ u q (t) − q (t) ≤ ϕ(s, q (s), u(s) + hu¯(s)) − ϕ(s, q (s), u(s)) ds ≤ ˆt1 t u+hu¯ u ≤ ϕ(s, q (s), u(s) + hu¯(s)) − ϕ(s, q (s), u(s)) ds (4.14) ˆt1 for all s ∈ [t1, t], we have u+hu¯ u ϕ s, q (s), u(s) + hu¯(s) − ϕ s, q (s), u(s) ≤ u+hu¯ u ≤ ϕ s, q (s), u(s) + hu¯(s) − ϕ s, q (s), u(s) + hu¯(s) + u u + ϕ s, q (s), u(s) + hu¯(s) − ϕ s, q (s), u(s) . (4.15) 26 GAST¨I¿ŒO S.F. FREDERICO, PAOLO GIORDANO, ALEXANDR A. BRYZGALOV, AND MATHEUS J. LAZO Now, we apply the Lipschitz condition (4.7) to the first summand, and a first order Taylor expan- sion with Lagrange remainder (Thm. 17.(i)) to the second one, obtaining u+hu¯ u u+hu¯ u ϕ s, q (s), u(s) + hu¯(s) − ϕ s, q (s), u(s) + hu¯(s) ≤ Lu+hu¯ · q (s) − q (s) (4.16)