arXiv:2005.14428v2 [math.FA] 15 Sep 2020 asne(PL,Sain1,C-05Luan,Switzerlan Lausanne, CH-1015 17, Station (EPFL), ( Lausanne 200020-162343/1. Grant under Foundation Science National htsol eteteise rtoe(hoe )that 2) (Theorem one first a issue: resul the classical we the settle formalism, of this should extensions on that Banach-space Based two 1). Schwart present by (Theorem extend supported the theorem III kernel Section of in form explain (distributional) first we t excludes operator, function identity measurable th a that be assumption should underlying response impulse the Since proof. chara together upgraded standard 1) new convolut (Proposition the filters a to BIBO-stable correction of of Section terization a definition present In integral then tools. classical We operator. the mathematical recall appropriate we of II, help the with distri of equations. theory differential the partial to and who tions contributions respectively 1962, Hörmander, fundamental and their Lars 1950 in for and medal Fields Schwartz the awarded Laurent were shoulde giants: the on two T rests of here. technical, clarify somewhat to is want which we argument, what is this not; Obviously stable? xlntosi pedxA n ec o ebrof member a (see not Lebesgue hence of and sense A) the L Appendix in function in measurable explanations a not is ic h bv ttmn xldsteiett operator, distribution identity Dirac the the excludes is response especially statement impulse references), above whose specific continuous the for the B in since confusion Appendix some (see discrete be domain for to straightforward seems fairly there sense pro systems, is the BIBO equivalence impulse While this its the summable/integrable. if of in only absolutely and stable is if is output) response bounded operator and Wikipedia of input convolution version (bounded a English the that in as is well as systems linear h pc fbuddRdnmaue,wihi uestof superset a is which measures, Radon bounded L of space the prtr hs mus epnei esrbefunction, convol measurable statement. the a classical to is the characterization response recover this impulse of whose scope operators the restrict we eso htalna hf-nain ytmi BIBO-stable is partic system In shift-invariant or fixes. linear n some a hypotheses) presenting this the that while of of show issue th purpose the we that lack The clarify operator. is to of identity troubling is the more ar (because exclude is textbooks usually vague What engineering incorrect. too in mathematically either found systems often convolution time [email protected] 1 1 imdclIaigGop cl oyehiu ééaede fédérale polytechnique École Group, Imaging Biomedical 1 h eerhlaigt hs eut a eevdfnigf funding received has results these to leading research The ntesqe,w hl eii h oi fBB stability BIBO of topic the revisit shall we sequel, the In ttmn hti aei otcusso h hoyof theory the on courses most in made is that statement A Abstract ( ( R https://en.wikipedia.org/wiki/BIBO_stability. R L ) ∞ ) osti enta h dniyoeao sntBIBO- not is operator identity the that mean this does , Lbsu’ pc fasltl nerbefntos.As functions). integrable absolutely of space (Lebesgue’s snei n nyi t mus epnei nlddin included is response impulse its if only and if -sense Tesaeet nteBB tblt fcontinuous- of stability BIBO the on statements —The .I I. NTRODUCTION ). oeo IOStability BIBO on Note A cesdNvme 2019 November Accessed o h Swiss the rom δ Since . ihe Unser Michael iha with -time ution ular, bu- ion ote we he he ed ey of z’ c- in rs . δ d e e 1 t , ctne,wopooe olmttefaeokt convoluti on to operating framework are the that operators limit to proposes who ichtinger, known with analysis. connection harmonic the make in also results prese we where are IV, derivations Section mathematical in filte The BIBO-stable generality. the full characterizes in an that continuous, 3) be convolution (Theorem the second of result the that imposes eege ee nta fdsgaigtecontinuous-tim the designating by of signal instead Here, Lebesgue. edvlptemteais npriua,ti eursth requires this particular, functions In mathematics. the the develop we saohritrsigpito iwta scmlmnayt IV. complementary Section is in that Thi discussed view [3]. as of pathologies ours, point avoid interesting to another order is in functions) bounded of ucinuulyseie by specified usually function ahmtclnotations mathematical ne h mlctasmto htteitga n()i wel is (1) in integral any the for that defined assumption implicit the under f ( nsalna hf-nain LI prtr(rsse)de system) (or operator by (LSI) shift-invariant linear a fines ecie as described oan eedn ntecnet h nu n output with and strict, input be the continuous can context, the requirements the in the on boundedness bounded. nontrivial makes Depending remains stability that domain: however, input BIBO aspect, of bounded mathematical formulation any one to is There response its that theorem, III-A. kernel Section Schwartz’ in by explained backed is interpretation This and distribution hog onws utpiainadtranslation. and multiplication pointwise through f eegessaeo one functions. bounded of space Lebesgue’s { seScinIIBfradtoa xlntos.Ti latt This as explanations). expressed additional often is for condition III-B Section (see f k Lebesgue: h − f f ( 2 t h ovlto ftofunctions two of convolution The elk omninasmlrcaicto fotb asFe- Hans by effort clarification similar a mention to like We nipratpatclrqieetfra S ytmis system LSI an for requirement practical important An fw fix we If I IOS BIBO II. scniuu (i.e., continuous is 1 k ∗ ) function A T ( sup : segnesuulyd,w r sn h esambiguous less the using are we do, usually engineers as , E f h R ) )( : fayBrlset Borel any of sup = △ t f ) → f or 7→ ( | h f t t f R ) ( h h t 7→ : n consider and f t h ∈ n t ovle o lee)vrinby version filtered) (or convolved its and TABILITY ) R ∗ R k ≤ | D ∗ t T = and s.t. ( f → | h f f ′ ∈ ( ∈ ( t mus response impulse Its . ∗ R R E t R h ) f L h ) ssi ob eegemaual ftepreimage the if Lebesgue-measurable be to said is f f in | hslte on ilb lrfidas clarified be will point latter This . f { )( ∞ cwrz pc fdsrbtos[17]. distributions of space Schwartz’ ohb measurable be both < δ T : t k smaual and measurable is R t ∈ } ( L ) R saBrlst[5.Tepoet spreserved is property The [15]. set Borel a is ∞ 7→ where , ∞ HE = f C △ ) hc rsswe h function the when arises which , . ( steiptsga,te 1 de- (1) then signal, input the as Z C f < R C R ( f 0 ) LASSICAL t ( h ,o nteloe es of sense looser the in or ), ) ∞ R ∈ ( τ or ) δ L ) awl-eae subclass well-behaved (a o lotany almost for f ∞ D ∈ ,f h, f ( ( t R ∈ − ) h k ′ F : 2 ( τ L f where ORMULATION R ste formally then is )d R ntesneof sense the in k p ) L ( τ R ∞ → steDirac the is ) k L f < and R k ∞ t L ∞} ( sthe is ∞ h noted R ∈ t ( nted = ) t a d (1) ) 7→ on as er = at rs R is o ∗ e s 1 l 2

Definition 1. The linear operator T: f 7→ T{f} is said to locally integrable, meaning that K |h(t)|dt < ∞ over any be BIBO-stable if compact domain K ⊂ R. TheR reassuring outcome, which 1) T{f} is well-defined for any f ∈ L (R), and, conforms with the practice in the field, is that one can 2) there exists a constant C > 0 independent∞ of f such that determine the stability of an LSI system by integrating the absolute value of its —even if h is not kT{f}kL∞ ≤ CkfkL∞ globally Lebesgue integrable, as in the case of an increasing for all f ∈ L (R). and possibly oscillating exponential. ∞ R The standard condition for the BIBO stability of the Proposition 1. If h ∈ L1( ), then the convolution operator defined by (1) is BIBO-stable with continuous-time convolution operator Th : f 7→ h ∗ f that f 7→ h ∗ f kh ∗ fkL∞ ≤ khkL1 kfkL∞ . Conversely, if the impulse response h is mea- is found in engineering textbooks is khkL1 < ∞, where the surable and locally integrable with , then L1-norm is defined by R |h(t)|dt = ∞ the system is not BIBO-stable, in whichR case it is said to be △ khkL1 = |h(t)|dt. (2) unstable. ZR Proof. The first statement is a paraphrasing of (3). For the con- A slightly more precise statement is h ∈ L1(R), where verse part, we assume that h ∈ L1,loc(R) with R |h(t)|dt = R R R L1( )= {f : → s.t. f is measurable and kfkL1 < ∞} ∞. Because of the local integrability of h, oneR can then still rely on the definition of the convolution given by (1), but only is Lebesgue’s space of absolutely integrable functions. if the input function f is bounded and compactly supported. The sufficiency of the condition h ∈ L1(R) is deduced from By considering the truncated versions f0,T = f0· ½[ T,T ] of the the standard estimate worst-case signal (4), we can therefore determine the− maximal h(τ)f(t − τ) dτ 6 |h(τ)| · |f(t − τ)| dτ value of the output signal as ZR ZR +T +T

6 (h ∗ f0,T )(0) = h(τ)sign h(τ) dτ = |h(τ)|dτ. |h(τ)| dτ kfkL∞ , Z T Z T ZR  −  − which is valid for any t ∈ R. The convolution integral (1) is The additional ingredient is the continuity of t 7→ (h∗f0,T )(t) therefore well defined if f ∈ L (R), which then also yields in the neighborhood of t = 0 (see Proposition 4 in Ap- ∞ the classical bound on BIBO stability pendix D), which allows us to conclude that (h ∗ f0,T )(0) ≤ supt R |h ∗ f0,T (t)| = kh ∗ f0,T kL∞ . While the latter quantity ∈ kh ∗ fkL∞ ≤khkL1 kfkL∞ < ∞. (3) is finite for any fixed value of T , we have that limT (h ∗ →∞ By adapting the argument that is used in the discrete-time f0,T )(0) = R |h(t)|dt = ∞, which indicates that the output formulation of BIBO stability, many authors (see Appendix signal becomesR unbounded in the limit. This shows that the underlying system is unstable. B) claim that the condition h ∈ L1(R) is also necessary. To that end, they apply the convolution system to a “worst-case” signal Another way of obtaining Proposition 1 is as a corollary of Theorem 3 (the complete characterization of BIBO-stable f0(t) = sign h(−t) (4)  systems) and Proposition 3 in Section IV. The important exam- in order to produce the strongest response at t =0, ples of unstable filters that fall within the scope of Proposition + + ∞ ∞ 1 are the systems ruled by differential equations with at (h ∗ f0)(0) = h(τ)sign h(τ) dτ = |h(τ)| dτ, least one pole in the right-half ; for instance, Z Z αt −∞  −∞ h(t)= ½+(t)e with Re(α) ≥ 0 [13]. The derivative operator which is then claimed to saturate the stability bound (3) with h = δ′ and the Hilbert transform with h(t)=1/(πt) are with kh ∗ f0kL∞ = khkL1 kf0kL∞ . Unfortunately, this simple unstable as well (as asserted by Theorem 3), but they fall reasoning has two shortcomings. First, unlike in the discrete outside the scope of Proposition 1: the first because δ′ is not setting, the characterization of what happens at t =0, which a function (but a distribution), and the second because the is a set of measure zero, is not sufficient to deduce that function 1/t is not locally integrable—in fact, the impulse kh ∗ f0kL∞ ≥ (h ∗ f0)(0), unless one invokes the continuity response of the Hilbert transform is the distribution “1/(πt)” of t 7→ (h ∗ f0)(t), which is not yet known at this stage (see that requires the use of a “principal value” for the proper Theorem 2). Second, one cannot ensure that the Lebesgue definition of the convolution integral [18]. convolution integral (1) is well defined for f ∈ L (R), 0 In the stable scenario, where h ∈ L1(R), we are able to 3 ∞ unless h is Lebesgue-integrable , which then considerably characterize the underlying filter by its frequency response limits the scope of the claim about necessity. △ jωt Our first practical fix is an extension of the argumentation h(ω) = F{h}(ω)= h(t)e− dt, (5) to the larger space L1,loc(R) of measurable functions that are ZR b which is the “classical” Fourier transform of h. Moreover, 3Any measurable function h : R → R admits a unique decomposition + − + − the Riemann-Lebesgue lemma ensures that h ∈ C (R) with as h = h − h with h ,h : R → R≥0. It is Lebesgue integrable if 0 + − min(kh kL , kh kL ) < ∞ [4]. khk ≤ khk . We recall that C (R) is the Banach space 1 1 sup L1 0 b b 3 of continuous and bounded functions that decay at infinity, Finally, we note that, for the cases where the Dirac impulse equipped with the sup-norm. δ is in the domain of the extended operator (for instance, when h ∈ C(R)), the distributional definition of the convolution III. BANACH FORMULATIONS OF BIBO STABILITY given by (6) yields Th{δ} = h ∗ δ = h, which explains the term “impulse response.” The classical textbook statements on continuous-time BIBO stability, including our reformulation in Proposition 1, have two limitations. First, they exclude the identity operator with B. Banach Spaces of Bounded Functions h = δ, as explained in Appendix A. Second, they are often In order to investigate the issue of BIBO stability, it is help- evasive concerning the hypotheses under which the condition ful to describe the boundedness and continuity properties of R h ∈ L1( ) is necessary (see Appendix B). In this section, we functions via their inclusion in appropriate Banach subspaces show how this can be corrected by considering appropriate of D′(R). The three relevant function spaces are Banach spaces. C0(R) ⊂ Cb(R) ⊂ L (R). ∞ A. Extension of the Notion of Convolution The central space consists of the subset of bounded functions The scope of our mathematical statements relies on that are continuous: Schwartz’ famous kernel theorem [5], [16] which delineates C (R)= {f : R → R s.t. f is continuous and kfk < ∞} . the complete class of linear operators that continuously map b sup R R R R D( ) → D′( ). We recall that D( )= Cc∞( ) is the space of It is a classical example of Banach space—a complete normed smooth and compactly supported test functions equipped with vector space [12]. The smaller space C0(R), which is also 4 R the usual Schwartz topology . Its topological dual D′( ) is equipped with the sup-norm, imposes the additional constraint the space of generalized functions also known as distributions. that f(t) should vanish at t = ±∞. It is best described as the R In essence, a distribution f ∈ D′( ) is a linear map—more completion of D(R) equipped with the sup-norm, which will precisely, a continuous linear functional—that assigns a real have its importance in the sequel. This property is indicated R number to each test function ϕ ∈ D( ); this is denoted by by C0(R) = (D(R), k·ksup). The concept is also valid for f : ϕ 7→ hf, ϕi. For instance, the definition of Dirac’s impulse R R R △ L1( ), which can be described as L1( ) = (D( ), k·kL1 ), as a distribution is δ : ϕ 7→ hδ, ϕi = ϕ(0). where the L1-norm is defined by (2) with f ∈ D(R) and Beside linearity, the property that defines an LSI operator the integral being classical—in the sense of Riemann. This is for any R. R R TLSI{ϕ(·− t0)}(t)=TLSI{ϕ}(t − t0) t0 ∈ completion property applies to Lp( )= (D( ), k·kLp ) with Schwartz’ theorem then tells us that there is a one-to-one p ∈ [1, ∞) as well [4, Proposition 8.17, p. 254], but not for R correspondence between continuous LSI operators D( ) → p = ∞, which explains the importance of the space C0(R), C(R) and distributions, with the defining distribution h ∈ which is distinct from L (R). R ∞ D′( ) being the impulse response of the operator. The relevant In order to properly identify L (R) as a subspace of D′(R), space of continuous functions here is C(R) with the topology we shall exploit the property that∞ the L -norm is the dual of ∞ of uniform convergence over compact sets, which involves the the L1-norm. We therefore choose to define the L -norm as N ∞ system of seminorms kfkN = sup t N |f(t)|,N ∈ . The △ | |≤ latter is an extended functional setup that tolerates arbitrary kfkL∞ = sup hf, ϕi = sup hf, ϕi, ϕ (R): ϕ L 1 ϕ L1(R): ϕ L 1 growth at infinity. ∈D k k 1 ≤ ∈ k k 1 ≤ (7) Theorem 1 (Schwartz’ kernel theorem for LSI operators). For where the central part of (7) takes advantage of the denseness5 any given h ∈ D′(R), the operator Th : ϕ 7→ h ∗ ϕ with of D(R) in L (R). This yields a definition that is valid not △ 1 t 7→ (h ∗ ϕ)(t) = hh, ϕ(t − ·)i (6) only for (measurable) functions, but also for all f ∈ D′(R). Consequently, we can redefine our target space as is LSI and continuously maps D(R) −→c. C(R). Conversely, for R c. R R R every LSI operator TLSI : D( ) −→ C( ), there is a unique L ( )= {f ∈ D′( ): kfkL∞ < ∞}, (8) ∞ h ∈ D (R) such that T = T : ϕ 7→ h ∗ ϕ where the ′ LSI h R convolution is specified by (6). which is readily identified as the topological dual of L1( ); that is, L (R) = L1(R) ′ due to the dual specification of ∞ Then, depending on the decay properties of h, it is generally the L -norm given by the right-hand side of (7). ∞ possible to extend the domain of the convolution operator Th While (8) defines L (R) as a subspace of D′(R), we can to some appropriate Banach space according to the procedure also identify its elements∞ as (bounded) measurable functions R described in Section IV. For instance, if h ∈ L1( ), then Th f : R → R via the classical association has a continuous extension L (R) → Cb(R) ⊂ C(R) that ∞ △ coincides with the classical definition given by (1). ϕ 7→ hf, ϕi = ϕ(t)f(t)dt, (9) ZR 4 A sequence of functions ϕk ∈ D(R) is said to converge to ϕ in D(R) 5 if (i) there exists a compact domain F that includes the support of ϕ and of This means that, for any f ∈ L1(R) and any ǫ > 0, there exists a △ n R all ϕk, and (ii) kϕk − ϕkn → 0 for all n ∈ N, where kϕkn = kD ϕkL∞ function ϕǫ ∈ D( ) such that kf − ϕǫkL1 < ǫ. It is a direct consequence n R R R R with D : D( ) →D( ) the nth derivative operator. of L1( )= (D( ), k · kL1 ). 4 where the right-hand side of (9) is a standard Lebesgue responses. We shall delineate the latter in a way that parallels integral. Now, the main difference between the sup-norm our definition of L (R), with the roles of the L1- and sup- (or and the L -norm is that, for f ∈ L (R) (identified as a L -) norms being∞ interchanged. To that end, we first define ∞ ∞ ∞ function), the inequality |f(t)|≤kfkL∞ holds for almost the M-norm as every t ∈ R. This means that it holds over the whole △ kfk = sup hf, ϕi. (10) real line except, possibly, on a set of measure zero. This M ϕ (R): ϕ sup 1 ∈D k k ≤ is often indicated as kfk = esssupt R |f(t)|, using the notion of essential supremum.∞ In other words,∈ the L -norm This then yields the Banach space is more permissive than the -norm with ∞ . sup kfk ≤kfksup M(R)= {f ∈ D′(R): kfk < ∞}, (11) However, the two norms are equal whenever the∞ function f M is continuous, which translates into the isometric inclusion which also happens to be the space of bounded Radon mea- iso. iso. 6 R R C0(R) ֒−→ Cb(R) ֒−→ L (R). sures on C0( ). In other words, M( ) is the topological dual ∞ of C0(R). Moreover, we can invoke the Riesz-Markov theorem to identify M(R) = C (R) ′ with the space of bounded C. Extended Results on BIBO Stability 0 signed Borel measures on R [15]. Concretely, this means that Remarkably, the combination of the two latter function any h ∈M(R) is associated with a unique Borel measure µh, spaces enables us to formulate a first Banach extension of the which then gives a concrete definition of the linear functional classical statement on BIBO stability. To that end, we shall △ restrict the distributional framework covered by Theorem 1 f 7→ hh,fi = f(τ)dµh(τ) (12) to the case where the impulse response h is identifiable as a ZR measurable function i.e., h ∈ L (R) ⊂ D′(R) . The linear for any measurable function f, while the total-variation norm 1,loc △ functional on the right-hand side of (6) then has an explicit of the measure µh is given by kµhkTV = R d|µh| = khk M integral description given by (1) with f = ϕ ∈ D(R). Within (see Section IV-C). The main point for us isR that M(R) is a R R this class of convolution operators, we now identify the ones superset of L1( ), with kfk = kfkL1 for all f ∈ L1( ). M whose domain can be extended to L (R). Moreover, we have that δ(·− t0) ∈ M(R) for any t0 ∈ R ∞ with kδ(·− t0)k = 1, as can be readily inferred from (10) Theorem 2. The convolution operator Th : f 7→ h ∗ f with M c. by considering a non-negative test function that achieves its h ∈ L1,loc(R) has a continuous extension L (R) −→ Cb(R) ∞ maximum ϕ(t )=1 at t = t . if and only if h ∈ L (R). Moreover, 0 0 1 For the cases where the impulse response h ∈ M(R) is

kh ∗ fksup ≤khkL1 kfkL∞ not an L1 function, we extend our definition of the original (Lebesgue) convolution integral as with the bound being sharp in the sense that it also yields the △ norm of the underlying operator: kThkL∞ Cb = khkL1 (see t 7→ (h ∗ f)(t)= hh,f(t − ·)i = f(t − τ)dµh(τ), (13) Definition 2). → ZR The proof of this result is deferred to Section IV see Item which is the same as (1) when we can write dµh(τ)= h(τ)dτ, which happens when the corresponding measure µh is abso- 2) and the final statement of Theorem 4. 7 It is of interest to compare Proposition 1 and Theorem 2 lutely continuous with respect to the Lebesgue measure. A because they address the problem of stability from different standard manipulation then yields that but complementary perspectives. Proposition 1 is focused pri- |(h ∗ f)(t)|≤ |f(t − τ)| d|µh|(τ) marily on the well-posedness of the convolution integral (1) for ZR f ∈ L (R). It can be paraphrased as: “Let h be a measurable ∞ ≤kfkL∞ d|µh| = kfkL∞khk , (14) (and locally integrable) function. Then, the Lebesgue integral ZR M (1) defines a convolution operator that is BIBO-stable if and which is the basis for the direct (easy) part of Theorem 3, only if h ∈ L (R).” By contrast, Theorem 2 considers the 1 where the complete class of BIBO-stable systems is identified, complete family of “classical” convolution operators T : h including the identity operator. D(R) → C(R) with h ∈ L1,loc(R) and precisely identifies the subset of operators that have a continuous extension from Theorem 3. The convolution operator Th : f 7→ h ∗ f with R R R c. L ( ) → Cb( ). Since Cb( ) is isometrically embedded in h ∈ D′(R) has a continuous extension L (R) −→ L (R) if ∞ L (R), this is more informative than Proposition 1 because it and only if h ∈M(R). Moreover, ∞ ∞ also∞ tells us that (h∗f)(t) is a continuous function of t ∈ R. In kh ∗ fk ≤khk kfkL∞ that respect, we note that the requirement that the convolution ∞ M of any bounded function f be continuous excludes the use of with the bound being sharp in the sense that kThkL∞ L∞ = the identity operator with h = δ at the onset, even if we extend khk . → M the framework to h ∈ D′(R). To obtain a more permissive characterization of BIBO 6We adhere with Bourbaki’s nomenclature to distinguish the two comple- stability, we need to extend the range of the operator from mentary interpretations of a measure: either as a continuous linear functional on D(R) (Radon measure), or as a set-theoretic additive rule that associates Cb(R) to L (R), which should then also translate into a a real number to any Borel set of R (signed Borel measure) [2], [1]. ∞ 7 corresponding enlargement of the class of admissible impulse Another way to put it is that h is the Radon-Nikodym derivative of µh. 5

This result, which is also valid in dimensions higher than 1, for all ϕ ∈ D(R) and some constant C > 0. is known in harmonic analysis [6, p. 140 Corollary 2.5.9], [18] c. Since a convolution operator T : D(R) −→ D (R) is but much less so in engineering circles. It can be traced back h ′ uniquely characterized by its impulse response h ∈ D′(R), to an early paper by Hörmander that provides a comprehensive c. the same holds true for its extension T : X −→Y, which treatment of convolution operators on L spaces [7]. The h p justifies the use of the same symbol. Rather than defining reminder of the paper is devoted to the proof of the two T {f} = h ∗ f through a Lebesgue integral as in (1) or theorems on BIBO stability and of some interesting variants h (13), we can therefore rely on (6) and define our extended (see Theorem 4). To that end, we shall rely on Schwartz’ convolution operator T : X → Y through a limit process. powerful distributional formalism which, as we shall see, h Specifically, we pick a Cauchy sequence (ϕn) in (D(R), k·k ) allows for a rather soft derivation, once the prerequisites have X such that limn ϕn = f ∈ X . Then, the sequence of been laid out. →∞ functions (gn = h ∗ ϕn) with IV. MATHEMATICAL DERIVATIONS t 7→ (h ∗ ϕn)(t)= hh, ϕn(t − ·)i (17) A. Extension of Convolution Operators is Cauchy in Y and converges to a limit g = limn (h ∗ The most general form of a convolution operator backed by ϕ ) ∈ Y, independently of the choice of the ϕ since→∞ the R n n Schwartz’ kernel theorem (see Theorem 1) is Th : D( ) → space Y is complete. We now recapitulate this process in the R R R .C( ) ֒−→D′( ) with h ∈ D′( ), where Th{ϕ} is defined by form of a definition (6) for any ϕ ∈ D(R). The two complementary ingredients at play there are: (i) the restriction of the domain to D(R)— Definition 3 (Banach extension of a distributional convolution the “nicest” class of functions in terms of smoothness and operator). Let X and Y be two Banach subspaces of D′(R) R decay—and (ii) the extension of the range to D′(R), which with the additional property that D( ) is dense in X . When can accommodate an arbitrary degree of growth (polynomial, the two conditions in Theorem 2 hold, the unique continuous c. or even exponential) at infinity. In other words, the theoretical extension Th : X −→Y of the convolution operator specified framework is such that it can deal with the very worst sce- by (17) with h ∈ D′(R) is defined by narios, including unstable differential systems whose impulse △ Th : f 7→ h ∗ f = lim (h ∗ ϕn) ∈ Y, (18) response is exponentially increasing. n →∞ Then, depending on the smoothness and decay properties where (ϕn) is any sequence in D(R) such that limn kf − →∞ of h, it is usually possible to extend the domain of Th to ϕnk =0. some Banach space X ⊇ D(R) that is continuously embedded X : ∗in D′(R), which is denoted by X ֒−→D′(R). For this to be Also important for our purpose is the adjoint operator T feasible, we require that k·k be a valid norm on D(R) Y′ → X ′, which is the unique linear operator such that and that D(R) be dense in X ,X which is equivalent to X = hg, T{f}i ′ = hT∗{g},fi ′ (D(R), k·k ). In other words, X is the completion of D(R) Y ×Y X ×X X equipped with the k·k -norm. for any g ∈ Y′ and f ∈ X ′, where the spaces X ′ and X We start by recalling the definition of the norm of a bounded Y′ are the duals of the topological vector spaces X and operator. Y, respectively. If T: X −→c. Y is bounded with operator norm kTk, then the adjoint T : Y −→Xc. is bounded with Definition 2. Let (X , k·k ) and (Y, k·k ) be two Banach ∗ ′ ′ X Y kT∗k = kTk. In particular, the adjoint of the convolution spaces and T a linear operator X → Y. Then, the operator c. c. operator T : D(R) −→D (R) is T ∨ : D(R) −→D (R), is said to be bounded if h ′ h ′ where h∨ is the time-reversed impulse response such that △ △ kT{f}k kTk = sup Y < ∞. hh∨, ϕi = hh, ϕ∨i, where ϕ∨(t) = ϕ(−t). X →Y f 0 kfk We now briefly show how we make use of these two mech- ∈X \{ } X anisms to specify the continuous extension Th : L (R) → A direct consequence of Definition 2 is that a bounded ∞ L (R) with h ∈ M(R) (or, h ∈ L1(R)) that is implicitly operator T: X → Y continuously maps X into Y, as indicated ∞ by T: X −→Yc. . referred to in Theorems 2 and 3. The enabling ingredient there Theorem 2 then describes a functional mechanism that is the continuity bound kh∨∗ϕkL1 ≤kh∨k kϕkL1 (see proof M R allows us to extend an operator initially defined on D(R). of Theorem 4, Item 4), which also yields h∨ ∗ ϕ ∈ L1( ) for all ϕ ∈ D(R). We then apply Definition 3 to specify the unique It is a particularization of a fundamental extension theorem in c. ∨ R R the theory of Banach spaces [14, Theorem I.7, p. 9]. extension Th : L1( ) −→ L1( ). An important point for our argumentation is that this (pre-adjoint) convolution operator Proposition 2 (Extension of a linear operator). Let X and also has a concrete implementation as Y be two Banach subspaces of D′(R) with the additional R property that D( ) is dense in X . Then, the linear operator t 7→ (h∨ ∗ ϕ)(t)= hh∨, ϕ(t − ·)i = ϕ(τ + t)dµh(τ), c. R T: D(R) −→ D′(R) has a unique continuous extension Z c. (19) X = (D(R), k·k ) −→Y with kTk ≤ C if and only if X X →Y which is supported by the same continuity bound with ϕ now (i) T{ϕ} ∈ Y, and (15) ranging over L1(R) instead of the smaller space D(R). The c. (ii) kT{ϕ}k ≤ Ckϕk (16) ∨ R R Y X existence and uniqueness of Th : L1( ) −→ L1( ) then 6

Rd guarantees the existence and unicity of the adjoint Th∗∨ : Due to the constraining topology of D( ), lim∆t 0 kϕ − R c. R → L ( ) −→ L ( ). To show that Th∗∨ = Th, we use the ϕ(·− ∆t)kLp =0 for any p ≥ 1, which proves the continuity ∞ ∞ explicit representation of Th∨ given by (19) with h∨ ∈M(R) of the function t 7→ (h ∗ ϕ)(t). This leads to the intermediate and invoke Fubini’s theorem to justify the interchange of outcome h ∗ ϕ ∈ Cb(R) for all ϕ ∈ D(R). integrals in If we now replace h by φ ∈ D(R), we readily deduce that Tφ{ϕ} = φ ∗ ϕ is compactly supported; hence, Tφ{ϕ} ∈ hT ∨ {f},gi = f(τ + t)dµ (τ) g(t)dt R R h h C0( ) for all ϕ ∈ D( ) with kφ ∗ ϕkL∞ ≤ kφkL kϕkL . ZR ZR  q p We then invoke Theorem 2 with X = (D(R), k·kp) to c. = f(x)g(x − τ)dµh(τ)dx deduce that Tφ : Lp(R) −→ C0(R) for p ∈ [1, ∞) and ZR ZR c. Tφ : C0(R) −→ C0(R) for any φ ∈ D(R). Since the convolu- R = f(x) g(x − τ)dµh(τ) dx tion is commutative, this implies that φ ∗ h = h ∗ φ ∈ C0( ) ZR ZR  for any h ∈ Lq(R) with q ∈ (1, ∞) resp., h ∈ C0(R) and = hf, Th{g}i φ ∈ D(R) ⊂ Lp(R) which, by completion with respect to the k·k norm with p ∈ (1, ∞), gives T : L (R) −→c. C (R) for any f ∈ L1(R) and g ∈ L (R). This proves that the Lp h p 0 ∞ R c. R with kThkLp C0 ≤ khkLq (resp., Th : C0( ) −→ C0( ) original convolution operator defined by (13) coincides with → c. with kT k = khk ). the adjoint of Th∨ : L1(R) −→ L1(R), which is also consistent h C0 C0 L1 → R iso. R with the property h = (h∨)∨. Since D(R) ⊂ L (R), we can Item 3. Since Cb( ) ֒−→ L ( ), the relevant duality bound c. ∞ ∞ therefore uniquely identify Th : L (R) −→ L (R) as the there is (14), which yields kh ∗ ϕkL∞ ≤khk kϕkL∞ . This ∞ ∞ M extension of Th : D(R) → D′(R) that preserves the adjoint allows us to use the same argument as in Item 1 to show that T {ϕ} ∈ C (R) for all ϕ ∈ D(R). Since C (R) = relation Th∗∨ =Th. h b 0 Rd (D( ), k·kL∞ ), we then apply the proven completion tech- B. Proof of Banach Variants of BIBO Stability nique to specify the unique operator Th : C0(R) → Cb(R) R c. The Banach spaces of interest for us are X = C0(R),Lp(R) with kThkC0 Cb ≤ khk . Conversely, let Th : C0( ) −→ R R R R → M and Y = C0( ), Cb( ),Lp( ) with p ≥ 1. Cb( ) with operator norm kThkC0 Cb < ∞. Then, for any ϕ ∈ C (R), → Theorem 4. Depending on the functional properties of its 0 R impulse response h ∈ D′( ), the convolution operator Th : (h ∗ ϕ)(0) = hh, ϕ∨i≤kThkC0 Cb kϕkL∞ → D(R) −→Dc. (R) defined by (6) admits the following (unique) ′ with ϕ ∈ C (R) and kϕ k = kϕk . By substituting ϕ continuous extensions8 ∨ 0 ∨ L∞ L∞ for ϕ∨ and by recalling that D(R) is dense in C0(R), we get 1) Let p, q ∈ (1, ∞) be conjugate exponents with 1 + 1 =1. p q that R R c. R Then, h ∈ Lq( ) ⇒ Th : Lp( ) −→ C0( ) with hh, ϕi hh, ϕi kThkLp C0 ≤khkLq . sup = sup → ϕ C0(R) 0 kϕkL∞ ϕ (R) 0 kϕkL∞ c. ∈ \{ } ∈D \{ } 2) h ∈ L1(R) ⇒ Th : L (R) −→ Cb(R) with ∞ = khk ≤kThkC0 Cb , (20) kThkL∞ Cb = khkL1 . M → → c. which then also proves that the bound is sharp. 3) h ∈M(R) ⇔ Th : C0(R) −→ Cb(R). c. Item 4. The key here is the estimate 4) h ∈M(R) ⇔ Th : L1(R) −→ L1(R). c. 5) h ∈M(R) ⇔ Th : L (R) −→ L (R). (h ∗ f)(t) dt ≤ |f(t − τ)| d|µh|(τ) dt ∞ ∞ ZR ZR ZR Moreover, the operator norms for Items 3-5, characterized by an equivalence relation, are kThkC0 Cb = kThkL1 L1 = = |f(x)|dx d|µh|(τ) → → ZR ZR  kThkL∞ L∞ = khk . Finally, under the hypothesis of local → M c. (by Fubini) integrability h ∈ L1,loc(R), the continuity of Th : L (R) −→ ∞ Cb(R) implies that h ∈ L1(R), which is the converse part of = |f(x)|dx d|µh| Item 2. ZR  ZR 

= kfkL1 khk , Proof: M R Item 1. Under the assumption that h ∈ Lq( ) with q ≥ 1, from which we deduce the boundedness of Th : L1(R) → we invoke Hölder’s inequality R L1( ) with kThkL1 L1 ≤ khk . (The extension technique → M is essentially the same as in Item 1 with p =1 and L1(R)= |(h ∗ ϕ)(t)|≤ |h(τ)| · |ϕ(t − τ)|dτ ≤khkLq kϕ(·− t)kLp R ZR (D( ), k·kL1 ).) The converse implication and the sharpness for any ϕ ∈ D(R), which yields the required upper bound of the bound will be deduced from Item 5 by duality. Item 5. Since L (R) = L1(R) ′, the adjoint of Th : kh ∗ ϕkL∞ ≤khkLq kϕkLp . Likewise, by linearity, we get that c. ∞ c. R R ∨ R R L1( ) −→ L1( ) is Th∗ = Th : L ( ) −→ L ( ). The ∞ ∞ |(h ∗ ϕ)(t) − (h ∗ ϕ)(t − ∆t)| = h ∗ ϕ(t) − ϕ(t − ∆t) equivalence h ∈M(R) ⇔ h∨ ∈M(R) implies the continuity  R c. R ≤khkLq ·kϕ − ϕ(·− ∆t)kLp . of Th : L ( ) −→ L ( ) with kThkL∞ L∞ ≤ khk = ∞ ∞ → M kh∨k . As for the converse implication, we take advantage 8See Definition 3 and accompanying explanations. The bottom line is that M iso. of the embedding C0(R) ֒−→ L (R), which allows us to reuse the definition of these operators is compatible with the convolution integral ∞ (1) or (13) depending on whether h is a function or a Radon measure. the argument of Item 3. 7

Item 2 and Its Converse. The first part follows from the C. Explicit Criterion for BIBO Stability beginning of the proof of Item 1, the application of the We now show how to determine khk (our extended extension principle for p = 1, and the commutativity of the criterion for BIBO stability) under the assumptionM that h ∈ R convolution integral, which yields f ∗h = h∗f ∈ Cb( ) with L (R). Any such impulse response can be identified with R 1,loc kh ∗ fkL∞ ≤ khkL1 kfkL∞ for any f ∈ L ( ). We show a distribution by considering the linear form that the bound is sharp by applying the convolution∞ operator to the “worst-case” signal f0 identified in (4). Conversely, let h : ϕ 7→ hh, ϕi = h(t)ϕ(t)dt, (22) R c. R ZR Th : L ( ) −→ Cb( ) with kThkL∞ Cb < ∞. Taking ∞ → iso. R R advantage of the isometric embedding Cb(R) ֒−→ L (R), which continuously maps D( ) → . It turns out that the we then invoke the equivalence in Item 5 to deduce∞ that latter is a special instance of a real-valued Radon measure, khk < ∞, which implies that h ∈ M(R). The announced which is an extended type of measure whose k·k -norm is M equivalenceM then follows from Proposition 3 in Section IV-C. not necessarily finite.

Definition 4 ([17]). A distribution f ∈ D′(R) is called a real- The result in Item 1 is discussed in most advanced treatises valued Radon measure if, for any compact subset K ⊂ R, there on the Fourier transform (e.g., [4, Proposition 8.8, p 241]). exists a constant CK > 0 such that We are including it here in a self-contained form—at the expense of a few more lines in the proof of Item 2—because hf, ϕi≤ CK sup |ϕ(t)| (23) t K it nicely characterizes the regularization effect of convolution. ∈ The equivalences stated in Item 4 and Item 5 are known for all ϕ ∈ D(K)= ϕ ∈ D(R): ϕ(t)=0, ∀t∈ / K . + in the context of the theory of Lp Fourier multipliers [6, A distribution f  ∈ D′(R) is said to be positive if Section 2.5], even though the latter does not seem to have hf +, ϕi ≥ 0 for all ϕ ∈ D+(R) = ϕ ∈ D(R): ϕ(t) ≥ permeated to the engineering literature. The equivalence in 0,t ∈ R .  Item 4 may also be identified as a special instance of Wendel’s The connection between the two kinds of distributions in theorem in the abstract theory of multipliers on locally com- Definition 4 is that a positive distribution is a special instance pact Abelian groups [11, Theorem 0.1.1, p. 2]. Interestingly, R of a Radon measure, while any real-valued Radon measure f the condition h ∈ M( ) is also sufficient for the continuity + c. admits a unique decomposition as f = (f −f ), where both of T : L (R) −→ L (R), a claim that is supported by the − h p p f +,f ≥ 0 are positive distributions [19, Theorem 21.2, p. Young-type norm inequality − 218]. One then also defines the corresponding “total-variation + measure” |f| = f + f −, which is positive by construction. kh ∗ fkLp ≤khk kfkLp , (21) M It turns out that the Dirac impulse δ is a positive Radon K which holds for any f ∈ Lp(R) with p ≥ 1. However, measure with a universal bounding constant C = 1. Like- R (21) is only sharp at the two end points p = 1, +∞, in wise, the minimal constant in (23) for f ∈ L1,loc( ) is C conformity with the statements in Items 4 and 5. In fact, K = K |f(t)|dt, which is essentially what is expressed in the only other case where the complete class of convolution PropositionR 3. R c. R operators Th : Lp( ) −→ Lp( ) has been characterized is Proposition 3 (Total-variation norm for measurable functions). for p = 2, with the necessary and sufficient condition being R + Let h ∈ L1,loc( ). Then, khk = khkL1 = ∞ |h(t)|dt. h ∈ L (R) (bounded frequency response) [18, Theorem M + −∞ ∞ Consequently, h ∈M(R) if and only if ∞ |hR(t)|dt< ∞. 3.18, p. 28], which is slightly more permissive than the BIBO b R−∞ requirement. Indeed, h ∈M(R) ⇒ h ∈ L (R), whereas the Proof: In accordance with Definition 4, we view h ∈ reverse implication does not hold. ∞ L (R) as a real-valued Radon measure with h+(t) = b 1,loc We like to single out Item 3 in Theorem 4 as the pivot max h(t), 0 and h−(t) = max 0, −h(t) , while the corre- + point that facilitates the derivation of the (nontrivial) reverse sponding total-variation measure is |h| = h +h− ∈ L1,loc(R) implications—namely, the necessity of the condition h ∈ with |h| : t 7→ |h(t)|, which is consistent with the notation. M(R). While the listed property is sufficient for our purpose, We then distinguish between two cases. we can refer to a recent characterization by Feichtinger [3, (i) Bounded Scenario. When khk < ∞, we can invoke M Theorem 2, p. 499] which, in the present context, translates the classical Jordan decomposition of a measure (see [4]), c. into the refined statement “h ∈ M(R) ⇔ Th : C0(R) −→ + ∀f ∈M(R): kfk = kf k + kf −k = k|f|k < ∞, C0(R).” The additional element there is the vanishing of M M M M (h ∗ f)(t) at infinity, which calls for a more involved proof. which allows us to reduce the problem to the easier determina- While the statement in Item 2 is a special case of Item 5, tion of k|h|k . Accordingly, for any given T > 0, we define M as made explicit in Section IV-C, the interesting part of the hT = |h| · ½[ T,T ] ≥ 0 and observe that story is that this restriction induces a smoothing effect on the − khT k = sup hhT , ϕi = hhT , 1i output, ensuring that the function t 7→ (h∗f)(t) is continuous. M ϕ (R): ϕ L∞ 1 There is obviously no such effect for the case h = δ ∈M(R) ∈D k k ≤ = kh k < ∞, R T L1 (identity) or, by extension, hd = n Z a[n]δ(·− n) ∈M( ) ∈ with khdk = kakℓ1 , which correspondsP to the continuous- where the supremum is achieved by considering any test func- M time transposition of a digital filter. tion ϕT ∈ D(R) such that ϕT (t)=1 for all t ∈ [−T,T ]. In 8

the limit, we get that limT khT kL1 = limT khT k = From the pragmatic point of view of an engineer, the title k|h|k < ∞, from which→∞ we conclude that→∞k|h|k M = question is at the heart of the matter to understand the scope M M khk = khkL1 . of Proposition 1, and the source of some confusion, too. Let us M (ii) Unbounded Scenario. The condition khk = ∞ can start by listing the elements that could suggest that the answer M be formalized as: for any n ∈ N, there exists ϕn ∈ D(R) with to the question is positive. kϕnkL∞ =1 such that hh, ϕni >n. However, 1) It is common practice to make liberal use of what mathematicians consider abusive notations; in particular, hh, ϕni≤ h(t)ϕn(t)dt ≤ |h(t)|dt kϕnkL∞ = khkL1 . equations such as f(t)= δ(τ)f(t−τ)dτ, which could ZR ZR R suggest that δ(τ) can beR manipulated as if it were a N Therefore, khkL1 >n for all n ∈ , leading to khkL1 = ∞. classical function of τ. 2) Dirac’s δ has the unit “integral” hδ, 1i = 1, which is Let us now conclude with a few more observations. indicated formally as δ(τ)dτ =1. Moreover, δ ≥ 0 in R R Since L1,loc( ) can be identified as the subspace of mea- the sense that it is a positiveR distribution (see Definition sures that are absolutely continuous (see [17, p. 18]), the result 4). in Proposition 3 is consistent with the well-known property 3) The Dirac impulse is often described as the limit of in probability theory that R coincides with the subset of n (tn)2/2 L1( ) ϕn(t) = e− as n → ∞, with ϕn ∈ S(R). bounded measures that are absolutely continuous. √2π Since kϕnkL1 =1 for any n> 0, this could suggest that Under the minimalistic assumption that h ∈ L1,loc(R), the limn kϕnkL1 =1 as well. convolution integral (1) is well defined for any t ∈ R provided →∞ In order to convince the reader that the answer to the title that the input function f : R → R is bounded and compactly question is actually negative, we now refute these intuitive supported. Equation (1) then even yields an output function arguments one by one. t 7→ (h ∗ f)(t) that is continuous, as shown in Appendix D. However, the trouble comes from the fact that the output then 1) The explicit description of the Dirac impulse as a centered Gaussian distribution whose standard deviation σn =1/n inherits the potential lack of decay of h when h∈ / L1(R). tends to zero suggests that δ = limn ϕn must be One can also make a connection between the result in →∞ Proposition 3 and the standard argument that is presented to entirely localized at t =0. The best attempt at describing this limit in Lebesgue’s world of measurable functions justify the necessity of h ∈ L1(R). When the latter condition is fulfilled, we have that would be +∞, t =0 khk = sup hh, ϕi p0(t)= M  0, otherwise, ϕ (R): ϕ L∞ 1 ∈D k k ≤ which is equal to zero almost everywhere. However, = khk = sup hh, φi = h(t)φ (t)dt, L1 0 since the width of the impulse is zero, we get that φ L∞(R): φ L∞ 1 ZR ∈ k k ≤ (24) R p0(t)dt =0, which is incompatible with the property thatR R δ(t)dt =1. This points to the impossibility of rep- where φ0(t) = sign h(t) . While the supremum is achieved resentingR δ by a function in L1(R) or even in L1,loc(R). R exactly over L ( ) by taking φ = φ0, it is a bit trickier Strictly speaking, δ is defined as a continuous linear R ∞ over D( ) because of the additional smoothness requirement. functional on D(R)—or, by extension, C0(R)—which Yet, due to the definition of the supremum, for any ǫ > 0 precludes the application of any nonlinear operation (such R p there exists a function ϕǫ ∈ D( ) with kϕǫk =1 such that as | · | ) to it. ∞ h(t)ϕǫ(t)dt = (1 − ǫ)khk ≤ khk = khkL1 . By taking 2) The generalized Fourier transform of δ is F{δ} = 1, M M Rǫ arbitrarily small, we end up with ϕǫ being a “smoothed” which is bounded, but not decreasing at infinity. If δ was rendition of φ0, so that the spirit of the initial argument is included in L1(R), this would contradict the Riemann- c. retained. Lebesgue Lemma, which is equivalent to F : L1(R) −→ R C0( ) with kFkL1 C0 = 1. By contrast, the inclusion → APPENDIX δ ∈M(R) is compatible with the (dual) continuity prop- R c. R R erty of the Fourier transform F ∗, F : M( ) −→ L ( ) A. Is the Dirac Distribution a Member of L1( )? ∞ with kF ∗k L0 =1. Let us start with the historical observation that the epony- M→ 3) While the sequence of rescaled Gaussians (ϕn) converges mous impulse δ is already present in the (early) works of both (to δ ∈ S′(R) ֒−→D′(R) in the (weak) topology of S′(R Fourier and Heaviside [10]. The former, as one would expect, (Schwartz’ space of tempered distributions), the problem defined it via an “improper” integral (the inverse Fourier is that it fails to be a Cauchy sequence in the (strong) transform of “1”), while the latter identified δ as the “formal” norm topology of L1(R). Hence, there is no guarantee derivative of the unit step (a.k.a. the Heaviside function). that δ = limn ϕn stays in L1(R). However, the mathematics for giving a rigorous sense to these →∞ identifications were missing at the time; one had to wait for the development Schwartz’ distribution theory in the 1950s B. Examples of Inaccurate Statements on BIBO Stability [17], which already shows that the mere process of obtaining This list is far from exhaustive and not intended to downplay a rigorous definition of δ was far from trivial. the important contributions of the listed people who are 9 internationally recognized leaders in the field. Its sole purpose where M is the smallest symmetric interval such that f(t)= is to illustrate the omnipresence of the misconception in the f(−t)=0 for all t∈ / M. For any given open bounded set engineering literature, including in some of the most popular K ⊂ R, we then observe that and authoritative textbooks in the theory of linear systems and ∀t ∈ K : (h ∗ f)(t) = (hK M ∗ f)(t) . +

As a start, one can read in the English version of Wikipedia where hK+M = h · ½K+M is the restriction of the original that a necessary and sufficient condition for the BIBO stability impulse response to the set K + M = {t + τ : t ∈ K, τ ∈ M}. of a convolution operator is that its impulse response be ab- Since hK+M ∈ L1(R), one has that solutely integrable, formulated as R |h(τ)|dτ = khkL1 < ∞. sup |h ∗ f(t)|≤kfkL∞ khK+MkL1 < ∞. (25) In view of the discussion aroundR Proposition 1, this is only t K correct if one restricts the scope of the statement to those ∈ Likewise, for any t,t ∈ K, we have that impulse responses that are Lebesgue-measurable and locally 0 integrable. |h ∗ f(t) − h ∗ f(t0)| Kailath mentions in [9, p. 175] that the equivalence between ≤kfkL∞ khK+M(t − ·) − hK+M(t0 − ·)kL1 . (26) BIBO stability and h ∈ L1(R) is well known, and attributes the result to James, Nichols, and Phillips [8]. It turns out that Next, we invoke Lebesgue’s dominated-convergence theorem R R the pioneers of the theory on control and linear systems were and the property that C0( ) is dense in L1( ) to show that K M K M focusing their attention on analog systems ruled by ordinary kh + (t−·)−h + (t0 −·)kL1 → 0 as t → t0. This, together with (26), implies that limt t0 |(h ∗ f)(t) − (h ∗ f)(t0)| =0, differentiable equations whose impulse responses are sums → of causal exponentials and, therefore, Lebesgue-measurable. which expresses the continuity of t 7→ h ∗ f(t) at t = t0 for K Kailath then presents a proof on p. 176 that is essentially the any t0 ∈ . one we used for Proposition 1, except that he neither considers a limit process nor explicitly says that h must be (locally) ACKNOWLEDGMENTS integrable. The author is extremely thankful to Julien Fageot, Shayan Oppenheim and Willsky discuss the property in [13, p. Aziznejad and Hans Feichtinger for having spotted inconsis- 113-114]. To justify the BIBO stability of the pure time-shift tencies in earlier versions of the manuscript and for their operator (including the identity), they then present an argument helpful advice. He is also appreciative of Thomas Kailath and in support of the inclusion of δ(·−t0) in L1(R) (Example 2.13) Vivek Goyal’s feedback, as well as that of the three anonymous which, in view of the discussion in Appendix A, is flawed. reviewers. Vetterli et al. claim in [20, Theorem 4.8, p. 357] that the operator Th is BIBO-stable from L (R) → L (R) if and REFERENCES ∞ ∞ only if h ∈ L1(R), a statement that is incompatible with [1] Jean-Michel Bony. Cours d’Analyse: Théorie des Distributions et Theorem 3. This can be corrected by limiting the scope of Analyse de Fourier. Editions Ecole Polytechnique, 2001. [2] Nicolas Bourbaki. Elements of Mathematics: 6. Integration. Springer, the equivalence as in the statement of Theorem 2. 2004. [3] Hans G. Feichtinger. A novel mathematical approach to the theory of translation invariant linear systems. In Recent Applications of Harmonic Analysis to Function Spaces, Differential Equations, and Data Science, D. Convolution in the “Unstable” Scenario pages 483–516. Springer, 2017. [4] Gerald B. Folland. Real Analysis: Modern Techniques and their Here, we characterize the output of a potentially “unstable” Applications. John Wiley & Sons, 2013. filter when the input signal is compactly supported. The [5] I. M. Gelfand and N. Ya. Vilenkin. Generalized Functions. Vol. 4. Applications of Harmonic Analysis. Academic Press, New York, USA, enabling hypothesis is the local integrability of the impulse 1964. response. [6] Loukas Grafakos. Classical and Modern Fourier Analysis. Prentice Hall, 2004. Proposition 4. Let f ∈ L (R) be compactly supported and [7] Lars Hörmander. Estimates for translation invariant operators in Lp h ∈ L (R). Then, the function∞ t 7→ (h ∗ f)(t) defined by spaces. Acta Mathematica, 104(1):93–140, December 1960. 1,loc [8] Hubert M. James, Nathaniel B. Nichols, and Ralph S. Phillips. Theory (1) is bounded on any compact set K ⊂ R and continuous; of Servomechanisms, volume 25. McGraw-Hill New York, 1947. that is, h ∗ f ∈ C(R). [9] Thomas Kailath. Linear Systems, volume 156. Prentice-Hall Englewood Cliffs, NJ, 1980. Proof. Because of the local integrability of h, the convolution [10] Hikosaburo Komatsu. Fourier’s hyperfunctions and Heaviside’s pseu- R dodifferential operators. In Takahiro Kawai and Keiko Fujita, editors, integral (1) is well-defined for any t ∈ with Microlocal Analysis and Complex Fourier Analysis, pages 200–214. World Scientific, 2002. [11] Ronald Larsen. An Introduction to the Theory of Multipliers. Springer- (h ∗ f)(t)= h(τ)f(t − τ)dτ = f(x)h(t − x)dx Verlag, Berlin, 1970. R M Z Z [12] Robert E. Megginson. An Introduction to Banach Space Theory. (by change of variable) Springer, 1998. [13] Alan V. Oppenheim and Alan S. Willsky. Signal and Systems. Prentice and Hall, Upper Saddle River, NJ, 1996. [14] Michael Reed and Barry Simon. Methods of Modern Mathematical Physics. Vol. 1: Functional Analysis. Academic Press, 1980. |(h ∗ f)(t)|≤kfkL∞ |h(t ± τ)|dτ < ∞, [15] Walter Rudin. Real and Complex Analysis. McGraw-Hill, New York, ZM 3rd edition, 1987. 10

[16] Laurent Schwartz. Théorie des noyaux. In Proc. International Congress of Mathematics 1950, volume 1, pages 220–230, Providence, RI, 1952. American Mathematical Society. [17] Laurent Schwartz. Théorie des Distributions. Hermann, Paris, 1966. [18] Elias M. Stein and Guido Weiss. Introduction to Fourier Analysis on Euclidean Spaces. Princeton University Press, Princeton, NJ, 1971. [19] François Trèves. Topological Vector Spaces, Distributions and Kernels. Dover Publications, 2006. [20] Martin Vetterli, Jelena Kovaceviˇ c,´ and Vivek K. Goyal. Foundations of Signal Processing. Cambridge University Press, 2014.