<<

SIAM J. SCI. COMPUT. c 2014 Society for Industrial and Applied Mathematics Vol. 36, No. 2, pp. A668–A692

DISCRETE PERIODIC EXTENSION USING AN APPROXIMATE STEP

NATHAN ALBIN† AND SUREKA PATHMANATHAN‡

Abstract. The discrete periodic extension is a technique for augmenting a given set of uniformly spaced samples of a smooth function with auxiliary values in an extension region. If a suitable extension is constructed, the interpolating trigonometric found via an FFT will accurately approximate the original function in its original interval of definition. The discrete periodic extension is a key construction in the algorithm FC-Gram (Fourier continuation based on Gram ) algorithm. The FC-Gram algorithm, in turn, lies at the heart of several recent efficient and high- order-accurate PDE solvers. This paper presents a new flexible discrete periodic extension procedure that performs at least as well as the FC-Gram method, but with somewhat simpler constructions and significantly decreased setup time.

Key words. , nonperiodic functions, Fourier continuation

AMS subject classifications. 42A15, 65T40, 65T50

DOI. 10.1137/130932533

1. Introduction. The purpose of this paper is to offer a straightforward ap- proach for addressing the approximation problem illustrated in Figure 1. In the figure, the solid curve represents a smooth function f(x) on the interval [0, 1], and the solid circular dots represent N samples of f at uniformly spaced nodes in this interval. The problem addressed in this paper is summarized as follows: Main problem: By using only the first d and last d data points, how does one produce an additional M function values in an interval [1,b] with the property that the interpolant of all N + M samples on [0,b] provides a highly accurate approximation of f(x) in the interval [0, 1]? In the figure, the trigonometric polynomial interpolant is represented by the union of the solid and dashed curves. For illustration purposes, the sizes of N, M,andd in the schematic are smaller than the values for these numbers used in practice. In the numerical comparisons presented in section 5, for example, M = 25, d = 10, and N ≥ 100. The algorithm presented can be viewed conceptually as a method for efficiently producing, for any choice of N ≥ d, a sparse (N +M)×N matrix Eper which maps the N samples of the original function to the N + M samples of its periodic extension. The sparseness of the matrix operator derives from the fact that only 2d samples are used to produce the extension, regardless the size of N. As will be shown 2 (see (2.8)), Eper has a very simple structure with only N +2d nonzero entries. Such a matrix, Eper, can be quite useful in practice, as it allows the high-precision approximation of a nonperiodic function f and its derivatives by means of FFT-based interpolation of the extended vector. Moreover, due to the sparsity of Eper and the efficiency of the FFT, the resulting algorithm will have O(N log N) complexity. The

∗Submitted to the journal’s Methods and Algorithms for Scientific Computing section August 12, 2013; accepted for publication (in revised form) November 27, 2013; published electronically April 10, 2014. http://www.siam.org/journals/sisc/36-2/93253.html †Department of Mathematics, Kansas State University, Manhattan, KS 66506 ([email protected]. edu). ‡Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409-1042 ([email protected]). A668 DISCRETE PERIODIC EXTENSION A669

d points d points N points M points

0 1 b

Fig. 1. Schematic diagram of the main problem described in section 1.

construction of Eper is modeled after a simple three-step procedure for solving the continuum version of the main problem: given a function f(x)onaninterval,how can one extend it to a smooth, periodic function f˜(x) on a slightly larger interval? The paper is organized as follows. The remainder of the introduction describes the solution to the continuum ex- tension problem and places the discrete extension problem in context with related constructions. Section 2 introduces the new discrete periodic extension algorithm, and section 3 explores the approximation properties of the new extensions. Section 4 discusses some of the finer details involved in implementing the described algorithm, section 5 presents numerical experiments demonstrating the effectiveness of the exten- sion in approximation, and section 6 summarizes the main conclusions of this work. 1.1. Periodic extension: Fourier methods for nonperiodic functions. The problem of periodic extension can be stated as follows. Problem 1. Let f be a smooth function defined on an interval [0, 1].Asmooth, periodic function f˜ on a larger interval [0,b] ⊃ [0, 1] is called a periodic extension of f if f˜(x)=f(x) for all x ∈ [0, 1]. In this paper, “smooth” will frequently be used as a general term representing some unspecified degree of differentiability. The reader desiring a more formal defini- tion may interpret “smooth” as C∞ or Cp with p ≥ 10. The solution of Problem 1 plays an important role in the generalization of FFT- based computational methods to the nonperiodic setting. Suppose that f is a smooth 1 function on the interval [0, 1]. In general, although the partial Fourier sums PN f, where N/2 b 2πikx/b PN f(x)= fˆke , k=−N/2

do converge pointwise to f on the open interval (0, 1), they do not converge uniformly. More significantly in the context of the present paper, if f is sampled at N uniformly spaced points in the interval (0, 1), then the interpolating trigonometric polynomial 1 IN f,where N/2−1 b 2πikx/b (1.1) IN f(x)= ake k=−N/2

with coefficients ak found by means of the FFT, will not provide a good approximation of f as a result of the associated Gibbs “ringing” artifacts. A670 NATHAN ALBIN AND SUREKA PATHMANATHAN

However, if f can be smoothly extended to a periodic function f˜ with periodicity b on a larger interval [0,b], then the partial Fourier sums PN f˜ will converge uniformly and rapidly at a rate controlled by the smoothness of f˜. This implies that f can be b accurately approximated on its domain [0, 1] by the restriction of IN f˜ for sufficiently large N. Such constructions have numerous applications, including their use in high- order derivative approximations for numerical PDEs solvers [3, 4, 8, 11, 20, 22, 23]. 1.2. Periodic extension in the continuum. The periodic extension problem defined in Problem 1 may be more accurately termed periodic extension in the contin- uum in order to distinguish it from the discrete analogue discussed at the beginning of this paper and revisited in section 1.3. The distinction—which motivates the present paper—is that in the continuum version of the problem, one wishes to extend a func- tion to a periodic function on a larger interval, while in the discrete version, one wishes to extend a vector of discrete samples of a function to a vector of discrete samples of a smooth periodic function on a larger interval. The problem of continuum periodic extension (or approximate extension) has been treated in [1, 2, 6, 13], as well as in the related results presented in [7, 14]. In fact, although they are targeted at the case when f is already known on an ex- tended interval, the presentations in [7, 14] provide strong motivation for the last two steps of the following three-step periodic extension procedure. This procedure for the continuum problem, in turn, motivates the discrete constructions to follow. Smooth extension. Let f be a smooth (Cp) function on the interval [0, 1], that is, f along with several of its derivatives are continuous and bounded on [0, 1]. By using the Taylor polynomial approximations of f at x =0andx = 1, it is trivial to extend f as a smooth function on all of R, as in Figure 2(a). In what follows, E represents such a smooth extension operator. In other words, given a smooth function f on [0, 1], Ef is a smooth function on all of R with the property that

Ef(x)=f(x)forx ∈ [0, 1].

In the present paper, we restrict our attention to the extension operator produced by the dth-degree Taylor polynomial approximation (with d

Windowing. Once the function f is smoothly extended to Ef, the extension can easily be given compact support by multiplication with a smooth windowing function, W , as shown in Figure 2(b). Provided W is chosen with the properties that W (x)=1 for x ∈ [0, 1] and W (x)=0forx/∈ (−δ, 1+δ), the window can be combined with the extension operator to form a new windowed extension operator EW with the following properties (see Figure 2(c)):

EWf = W ·Ef, EWf(x)=f(x)forx ∈ [0, 1], supp (EWf) ⊆ [−δ, 1+δ].

Periodization. Finally, the compactly supported smooth extension EWf can be periodized into a smooth, periodic extension Eperf with period b>0 through the DISCRETE PERIODIC EXTENSION A671

-δ 1+δ -δ 1+δ 3 1 2 0.8 1 0.6 0 0.4 -1 0.2 -2 0 -3 -0.5 0 0.5 1 1.5 -0.5 0 0.5 1 1.5 (a) (b)

-δ 1+δ b 3 3

2 2

1 1

0 0

-1 -1

-2 -2

-3 -3 -0.5 0 0.5 1 1.5 -0.5 0 0.5 1 1.5 (c) (d)

Fig. 2. In (a), afunctionon[0, 1] (dark curve) is extended smoothly to all of R (light curve). The resulting function is multiplied by the windowing function (b) to produce a smooth extension with support in [−0.1, 1.1] (c). In (d), this function is periodized to period b =1.15 via (1.3).

operator Eper defined by ∞ (1.3) Eperf(x)= EWf(x + kb). k=−∞

Provided the period b is chosen so that b ≥ (1 + δ), the resulting smooth, periodic function will remain an extension of the original function f as in Figure 2(d):

Eperf(x)=f(x)forx ∈ [0, 1].

1.3. Discrete periodic extension. The discrete periodic extension described at the beginning of this paper is naturally a discrete analogue of Problem 1. Given a N vector of samples f =(fj) ∈ R of f at the uniformly spaced points 0 = x1,x2,..., xN = 1 with step size h = xj+1 − xj , the goal is to produce an extended vector of N+M values ˜f =(f˜j ) ∈ R with the property that f˜j = fj for j =1, 2,...,N.To establish the connection with the continuum problem, define

(1.4) xj =(j − 1)h, j ∈ Z.

Then {x1,x2,...,xN+M } form an equispaced set of nodes on the periodic interval [0,b], where b =1+(M +1)h =(N + M)h. These nodes should be interpreted as the nodes at which the periodic extension f˜ is sampled. A672 NATHAN ALBIN AND SUREKA PATHMANATHAN

N+M Clearly, any choice of {f˜j }j=N+1 will produce a discrete extension ˜f. However, not all extensions are equally useful. For many applications, it is also necessary that b IN+M f˜(x) ≈ f(x)everywherein[0, 1]. The discrete version on Problem 1 can be stated as follows. Problem 2. Let h>0 be a given step size for a set of N + M uniformly spaced nodes

{0=x1,x2,...,xN =1,xN+1,xN+2,...,xN+M = b − h}.

N Let f be a smooth function on [0, 1] with samples f =(fj )=(f(xj )) ∈ R .Adiscrete N+M periodic extension of f is a vector ˜f =(f˜j ) ∈ R with the properties that f˜j = fj b for j =1, 2,...,N, and the trigonometric polynomial interpolant IN+M f˜ of the points N+M {(xj , f˜j )}j=1 provides an accurate approximation of f(x) in the interval [0, 1]. 1.4. Fourier continuation. The problem of Fourier continuation (FC), or N Fourier extension is closely related to Problem 2. Given a set of samples {(xj ,fj)}j=1 of a smooth function f on the interval [0, 1], the goal of FC is to produce Fourier W coefficients {fˆk}k=−W such that

W (1.5) f˜(x)= fˆk exp(2πikx/b) ≈ f(x)forx ∈ [0, 1]. k=−W

That is, f˜ is a -limited, b-periodic function which approximates f in the interval [0, 1]. The bandwidth W is not specified above, although, for efficiency of use in FFT-based algorithms, it is important that W be kept as small as possible. The periodicity, b,off˜ is also not specified. However, since the goal of FC is to produce an approximation f˜(x) ≈ f(x)in[0, 1], it is clear that a necessary condition for high- order accuracy is that b>1. If b = 1, then the would destroy the accuracy of the approximation except when f is the restriction of a 1-periodic smooth function. This method was first introduced in [6, 9, 10] and there has been a growing body of work surrounding the FC method since. A recent, related work [12] provides a stable multidimensional algorithm for obtaining trigonometric interpolants of smooth periodic functions given values at arbitrary points (possibly restricted to a strict subset of the periodic cell). FC algorithms have been shown to produce superalgebraic or even exponentially accurate approximations (depending on the initial smoothness of f) [13, 19]. More- over, FC can be performed stably [2] and efficiently [18] and requires few points per wavelength to resolve oscillatory data [1]. However, despite its remarkable accuracy in representing smooth, generally nonperiodic functions by means of Fourier series, there is as yet no known methodology for utilizing these FC methods in stable PDEs solvers. 1.5. The FC-Gram approach. The FC-Gram method, on the other hand, has been used successfully in a wide variety of highly accurate and robust PDEs solvers [3, 4, 8, 11, 20, 22, 23]. The secret of FC-Gram’s success appears to be related to the delicate balance between accuracy and stability in solvers for initial boundary-value PDEs. Indeed, by reducing the convergence order from superalgebraic (or exponential) to only polynomial (see Principle 1 in section 1.6 below), the FC- Gram approach has been shown to produce high-order accurate, stable PDEs solvers with exceptionally small artificial errors. DISCRETE PERIODIC EXTENSION A673

The FC-Gram algorithm proceeds by first forming a discrete periodic extension, as defined in Problem 2, and then producing the coefficients {fˆk} in (1.5) by means of an FFT. The “Gram” of FC-Gram refers to the use of an orthonormal Gram polynomial basis (see section 2.1) in the periodic extension stage, with the order of the polynomial approximation determining the order of accuracy of the resulting solver. The computational power of the FC-Gram method arises from its efficiency. In order to utilize FC-Gram, an extension database is first generated offline. Once this database has been generated, any vector of any length N canbeextendedina very small O(1) time, leading to the computational efficiency of the method. Although the FC-Gram method has been used successfully to produce a variety of high-order, efficient PDEs solvers, there are two reasons to seek a replacement for the construction. First, the FC-Gram precomputation is fairly complex, requiring approximations on auxiliary fine grids. Although the method is not unreasonably dif- ficult to implement, a simplified construction could certainly render FC-based solvers more accessible and more amenable to analysis. Second, the precomputation step is computationally intensive. While this does not impact the efficiency of the solvers utilizing FC-Gram, it does complicate the pro- cess of solver development. The FC-Gram construction provides a number of tunable parameters (e.g., the order of polynomial approximation, the number of points in the extension region, and the locations of the sample nodes), and a different database must be computed for each choice of these parameters. The database computation requires the evaluation in very high precision of the SVDs of moderately large matri- ces. Because of this, the generation of an extension database for a single parameter set can require several minutes, and “parameter sweeps” for testing and development purposes can quickly become cumbersome. An improvement in the offline comput- ing time can significantly decrease the time required for parameter studies in the development of new FC-based solvers. 1.6. Accuracy and convergence order. There is an important distinction to be made between the usual treatment of Problem 1 and the FC-Gram treatment of Problem 2. The latter approach—and the approach adopted in this paper—can be summarized by two basic principles. Principle 1. Polynomial convergence order is acceptable, provided that high accuracy can be practically attained. Principle 2. Errors with magnitudes near machine precision are tolerable. In practice, Principle 1 has allowed the development of the variety of PDEs solvers presented in [3, 4, 8, 11, 20, 22, 23], which have formally polynomial convergence order that can be observed at domain boundaries, but which also have spectral-like behav- ior (e.g., exceptionally low dispersion errors) within the interior of the computational domain. (The second example of section 5 demonstrates the spectral nature of the method within the interior.) Principle 2 allows a great deal of freedom in the fol- lowing constructions that is not generally allowed in the treatment of Problem 1. In particular, this principle allows the use of the mildly discontinuous window function presented in section 2.2. 1.7. Aim. The aim of the present paper is to present a simplified discrete pe- riodic extension, analogous to the three-step procedure for the continuum problem outlined in section 1.2, which may act as a replacement for the more complex FC- Gram algorithm. 
Thus, while there is considerable freedom in the constructions to follow and one may naturally wonder which choices lead to the “best” extension in some sense, the goal of the present paper is less lofty. We merely wish to present A674 NATHAN ALBIN AND SUREKA PATHMANATHAN

a discrete version of the three-step extension process that performs as well as the original FC-Gram construction. The value of this replacement for FC-Gram lies in the significantly decreased computational time required in the offline precomputations. In both the original FC- Gram algorithm and in the algorithm presented in this paper, the cost of computing the extension database is dominated by the cost of solving a linear system. As stated in section 1.5, the offline of FC-Gram requires the very-high-precision computation of SVDs of moderately large (e.g., 91 × 35) matrices. The method presented here, however, only requires the solution of much smaller (e.g., 16×16) linear systems. This has the effect of reducing the time required to compute an extension database from a few minutes to a few seconds, thus significantly reducing the time needed to perform parameter studies. By replacing FC-Gram by this simplified construction, we hope not only to make FC methods more accessible to the computational community but also to facilitate the future development and analysis of FC methods and FC-based PDEs solvers.

2. The algorithm.

2.1. Smooth extension. The first step in the discrete periodic extension is the discrete analogue of the smooth extension of (1.2): given the vector f ∈ RN , produce a ∞ {f˜j}j=−∞ with the property that f˜j = fj for j =1, 2,...,N. This sequence can be interpreted as the samples of a smooth function f˜ with f˜j = f˜(xj ), where xj is defined in (1.4). The FC-Gram extension presented in [3, 4, 8, 11, 20, 22, 23] depends only on some number d of “near-boundary” function values {f1,f2,...,fd} near x =0and{fN−d+1,fN−d+2,...,fN } near x = 1. The extension is based on the values of the interpolating polynomials near these two boundary points. A natural choice of discrete extension in the present context, then, is as follows. Let p and pr be the unique, lowest-degree polynomials with the respective properties that

d p interpolates {(xj ,fj )}j=1 and N pr interpolates {(xj ,fj )}j=N−d+1.

Define the extension {f˜j} as follows: ⎧ ⎨⎪p(xj )ifj<1, f˜j = fj if 1 ≤ j ≤ N, ⎩⎪ pr(xj )ifj>N.

As noted in [3, 4], the number of points d does not need to be chosen the same on the left and right sides of the interval. This observation leads to the biased- order extensions presented in those references. There is also no reason to restrict the extension to polynomial interpolants. However, as this choice produces the results desired for the present paper, it is a natural and easy choice. Finally, since the extended sequence will eventually be multiplied by a window with compact support, only a finite number of the values {f˜j} ever need be computed. Henceforth, we will assume that f ∈ RN is extended by polynomial extrapolation (with possibly differing N+2M degrees d and dr on the left and right, respectively) to a new vector ˜f ∈ R , where M is the width (in grid points) of the window’s transition region ([−δ, 0] or [1, 1+δ] in Figure 2(b)). DISCRETE PERIODIC EXTENSION A675

This operation can be written in a computationally convenient matrix form as follows. Let A and B be the two Vandermonde matrices ⎛ ⎞ 2 d−1 1 x1 x1 ··· x1 ⎜ 2 d−1⎟ ⎜1 x2 x2 ··· x2 ⎟ ⎜ 2 d−1⎟ ⎜1 x3 x3 ··· x3 ⎟ A = ⎜ ⎟ and ⎝. . . . . ⎠ ...... 2 d−1 1 xd x ··· x ⎛ d d ⎞ 2 d−1 1 x1−M x ··· x ⎜ 1−M 1−M ⎟ ⎜ 2 ··· d−1 ⎟ ⎜1 x2−M x2−M x2−M ⎟ ⎜ 2 ··· d−1 ⎟ B = ⎜1 x3−M x3−M x3−M ⎟ . ⎜. . . . ⎟ ⎝. . . . ⎠ 2 d−1 1 x0 x0 ··· x0

That is, A is a d × d matrix and B is a M × d matrix. Given any vector v ∈ d −1 R , the vector A v contains the coefficients (in order of increasing degree) of the d −1 polynomial interpolant of the points {(xi,vi)}i=1, while the vector BA v contains 0 the values of the same interpolating polynomial evaluated at the M points {xi}i=1−M (i.e., the trivial “leftward” extension of the polynomial). Let A = QR be the QR −1 −1 T decomposition of A and define E = BR .ThenBA = EQ . A similar construction near the right endpoint produces another pair of matrices Er and Qr. T T T T d dr N−d−dr Writing f =(f , fc , fr ) ,wheref ∈ R , fr ∈ R ,andfc ∈ R , the extension operation can be expressed in matrix form as ⎛ ⎞ T EQ 00 ⎛ ⎞ ⎜ ⎟ ⎜ I 00⎟ f ⎜ ⎟ ⎝ ⎠ ˜f = Ef = ⎜ 0 I 0 ⎟ × fc . ⎝ ⎠ 00I fr T 00ErQr An alternative interpretation of the Q and E matrices can be given in terms of  d Gram polynomial bases. Let {qj−1}j=1 be the polynomial basis obtained by applying the Gram–Schmidt orthonormalization process to the monomial basis {1,x,x2,..., xd−1} with respect to the inner product

d p, q = p(xi)q(xi), i=1

r dr and define {qj−1}j=1 analogously. Then Q is the orthogonal d × d matrix with values  (2.1) (Q)ij = qj−1(xi),i,j=1, 2,...,d, and E is the M × d matrix with entries

 (2.2) (E)ij = qj−1(x(i−M)),i=1, 2,...,M, j =1, 2,...,d.

Again, Qr and Er are defined analogously for the portion of the extension to the right of the boundary point x = 1. A discussion of computational considerations in the implementation of this extension are delayed until section 4. A676 NATHAN ALBIN AND SUREKA PATHMANATHAN

0.4 1 0.3 0.8 1 0.2 0.6 0.8 0.1 0.6 0 0.4 0.4 -0.1 0.2 -0.2 0.2 -0.3 0 0 -0.4 0 α 1-β 1 0 α α+η 1-β- η 1- β 1 0 0.2 0.4 0.6 0.8 1 (a) (b) (c)

Fig. 3. In (a), a smooth step function transitioning between the values 0 at α and 1 at 1 − β. In (b), the sampling of the step at M =25equispaced√ nodes in the interval√ (α, 1 − β).In(c), the FC-Gram continuation matching the function x →−1/ 3 to x → 1/ 3.

2.2. Windowing and periodization. The extended vector ˜f can now be ta- pered to zero on either end by multiplication with a suitable windowing function. The shape of the window, as shown in Figure 2(b), is determined by its behavior in the transition regions [−δ, 0] and [1, 1+δ]. Assuming a natural symmetry, the smooth window is determined by the smooth “step-up” in the interval [−δ, 0]. Let ψ be a C∞ function on [0, 1] as shown in Figure 3(a), with 0 ≤ ψ ≤ 1, and

ψ(x) ≡ 0forx in a neighborhood of [0,α]and (2.3) ψ(x) ≡ 1forx in a neighborhood of [1 − β,1].

In particular, this implies that all derivatives of ψ vanish at x = α and x =1−β.The significance of α and β will become apparent in section 2.2.1, when the requirements on ψ in [0,α] ∪ [1 − β,1] are relaxed slightly. Any such ψ generates an associated smooth step-up function on [0, 1] defined as

(2.4) S(x)=ψ (α(1 − x)+(1− β)x) , and the resulting window function W (x; δ) is defined as follows: ⎧ ⎪0 x ∈ (−∞, −δ] ∪ [1 + δ, ∞), ⎨⎪ 1 x ∈ [0, 1], W (x; δ)= ⎪ x ∈ − ⎪S 1+ δ x ( δ, 0), ⎩ x−1 S 1 − δ x ∈ (1, 1+δ).

In order to apply the window to the extended vector ˜f constructed in the previous section, we need only know M values of ψ sampled in the transition region (α, 1 − β). As shown in Figure 3(b), the required samples are exactly 1 − α − β (2.5) ψ(α + jη) j =1, 2,...,M, where η = . M +1 The values of ψ outside of (α, 1 − β) are not used in the construction. Instead, in section 2.2.1, the regions [0,α]and[1− β,1] are used to enforce (2.3) approximately, providing a quantitative measure of fitness for approximate step functions. The in- terval (α, 1 − β) corresponds (after scaling and translation) to the interval (−δ, 0) in Figure 2(b). After scaling, the parameter η coincides with the grid step size h, and, DISCRETE PERIODIC EXTENSION A677 thus, the samples defined in (2.5) are exactly the required samples of W (x; δ) within the transition region (−δ, 0). These values can be multiplied elementwise to the columns of E and Er of the previous section to produce the discrete windowed extension operator ⎛ ⎞ ˜ T EQ 00 ⎛ ⎞ ⎜ ⎟ ⎜ I 00⎟ f ⎜ ⎟ ⎝ ⎠ (2.6) ˜fW = EW f = ⎜ 0 I 0 ⎟ × fc , ⎝ ⎠ 00I fr T 00E˜rQr where E˜ and E˜r are the respective counterparts to E and Er, tapered to zero by the smooth step ψ,

 (2.7) (E˜)ij = ψ(α + iη) qj−1(x(i−M)),i=1, 2,...,M, j =1, 2,...,d, and similarly for E˜r. The periodization step is nearly trivial. The simplest periodization is, of course, to treat ˜fW in (2.6) as samples of a periodic function. Another choice is the periodization with smallest period (largest overlap), which can be written in matrix form as ⎛ ⎞ I 00 ⎛ ⎞ ⎜ ⎟ f ⎜ 0 I 0 ⎟ ⎝ ⎠ (2.8) ˜fper = Eperf = × fc . ⎝ 00I ⎠ T T fr E˜Q 0 E˜rQr

For computational reasons, it may be useful to produce extensions with alternative periods (e.g., in order to perform FFTs on vectors whose lengths can be factored into small primes). This is accomplished by replacing E˜ and E˜r in (2.8) by the matrices 0 E˜r and , E˜ 0 respectively, where the 0 represents a zero block of appropriate height. 2.2.1. An optimization problem. What remains, then, is to choose the step function ψ leading to the windowing function W . In the choice of windowing function, there are competing requirements that must be balanced. The window needs to be well-resolved on the discrete grid but with as few grid points as possible. This last requirement arises both from an efficiency perspective as well as from a stability perspective. From the efficiency point of view, it is clear that more points in the transition region leads to more work involved in the FFT. From the stability point of view, the window should decay to zero as rapidly as possible in order to overpower the polynomial growth of the smooth extension. In some ways, this problem is similar to those faced in developing certain “basis localization” techniques that utilize optimized window functions to create compactly supported basis functions from standard bases that retain, as much as possible, the beneficial aspects of the original basis. For example, [16] describes an optimal approx- imate window for the Whittaker cardinal function, and [21] presents an optimized window for the standard trigonometric basis. A678 NATHAN ALBIN AND SUREKA PATHMANATHAN

A number of numerical experiments by the authors have indicated that the con- struction described in this paper works reasonably well for a variety of window func- tions. Here we describe the construction of the most effective window we have iden- tified to date. The key to the construction lies in the two principles declared in section 1.6. First, because Principle 1 has already been invoked in the use of the polynomial approximation in section 2.1, there is clearly no need for a C∞ window. Furthermore, if we accept Principle 2, it is readily clear that the window need not even be continuous, provided the errors caused by the discontinuities are very small—a point also argued, for example, in [6, section 1.6]. This provides a heuristic for choosing a windowing function W .Namely,W need not be smooth or even continuous, merely “almost continuous.” By relaxing this constraint, it is possible to focus on the other aspects of W . In particular, the associated smooth step ψ (see Figure 3) should satisfy ψ(x) ≈ 0in[0,α], and ψ(x) ≈ 1 in [1 − β,1], and it should be sufficiently frequency-limited that its transition region can be resolved by M points. These requirements can be restated in the form of an optimization problem on ψ:

α 1 2 (2.9) minimize ψ(x)2 dx + (ψ(x) − 1) dx, 0 1−β

where the minimum is taken over all ψ of the form W ψ(x)= ak cos(kπx). k=0

Note that, although it is a desirable property of the step function, we do not require that ψ exhibit a (nearly) monotone transition, nor even that the range of ψ is bounded near [0, 1]. In fact, for some choices of parameters, the minimizing ψ certainly does not appear to satisfy such constraints. However, there is reason to believe that the minimizer will behave as a smooth step for some choices of parameters. Consider the graph shown in Figure 3(c). This graph displays the transition over M√ =25 point values√ used by the FC-Gram method between the functions x →−1/ 3and x → 1/ 3. In this case, the FC-Gram extension (see [11, 20]) is computed by an optimization problem very similar to that of (2.9) with the integrals replaced by discrete 2 approximations on a fine auxiliary grid. This function evidently satisfies the step-like behavior we seek. The form of (2.9) allows a semianalytic solution to the optimization problem: the stationarity conditions take the form

T (2.10) Aa = b with a =(a0,a1,...,aW ) .

After straightforward algebraic and trigonometric manipulations, the coefficient of a in the associated with differentiating with respect to ak is found to be

α 1 A+1,k+1 =2π cos(πx)cos(kπx) dx +2π cos(πx)cos(kπx) dx, 0 1−β

which simplifies to

sin(( + k)πα) sin(( − k)πα) +k sin(( + k)πβ) −k sin(( − k)πβ) + +(−1) · +(−1) ·  + k  − k  + k  − k DISCRETE PERIODIC EXTENSION A679

when  = k and to 1 π(α + β)+ 2k (sin(2kπα)+sin(2kπβ)) when  = k =0 , 2π(α + β)when = k =0.

The entry in b corresponding to differentiating with respect to ak is 1 k 2sin(kπβ) (−1) · k if k =0 , bk+1 =2π cos(kπx) dx = 1−β 2πβ if k =0.

Finally, we observe that the problem can be simplified somewhat if α = β,in which case the symmetry of the problem leads to the observation that the optimal choice of coefficients {ak} has the property that ak = 0 for all nonzero even numbers k. With this assumption, ψ has the form

(2.11) ψ(x)=a0 + a1 cos(πx)+a3 cos(3πx)+a5 cos(5πx)+···+ aW cos(Wπx),

where, without loss of generality, W is assumed to be odd. Although the resulting (W +3)/2×(W +3)/2 linear system is poorly conditioned, it can be solved numerically in high precision as described in section 4. Because the intent of the current construction is to reproduce as faithfully as possible the behavior of the FC-Gram approach, we now seek a solution to the problem with the choice M = 25, which is the number of extension points produced by the algorithm employed in [3, 4, 8, 11, 20, 22, 23]. The parameter choices (obtained by some intuition and a bit of trial and error)

α = β =0.15,W=29

produce a smooth step ψ with the desirable properties that (2.12) max |ψ(x)| < 10−15, max |1−ψ(x)| < 10−15, −10−15 <ψ(x) < 1+10−15 x∈[0,α] x∈[1−β,1]

(as can be observed by graphing ψ)andthatM = 25 extension points are sufficient to achieve the accuracy of FC-Gram (as is demonstrated in section 5). The coefficients {ak} as well as the M = 25 discrete samples of ψ for this choice of parameters are presented in Appendix A. 3. Approximation accuracy. As is demonstrated in section 5, the method described in the present paper exhibits virtually identical approximation accuracy as the FC-Gram method when the coefficients of the former are chosen correctly. This is hardly surprising, since the two methods are closely related; the algorithms for constructing periodic extensions may differ in detail, but they are similar in spirit. The present section presents an analysis of the approximation accuracy of the new method, which has several characteristics similar to the FC-Gram estimates in [17, section 5.2]. p To simplify the notation, suppose d = dr = d.Letf be a C function on [0, 1] d with p>d, and let Ef be the C extension of f defined in (1.2). Let fj = f(xj ) be the sampled values of f in the interval [0, 1] at the N uniformly spaced points 0=x1,x2,...,xN = 1. The discrete periodic extension operator Eper defined in (2.8) extends the vector of sampled function values fj by sampling in the interval [1,b]a combination of two polynomials p and pr which are blended together by means of A680 NATHAN ALBIN AND SUREKA PATHMANATHAN

a smooth step function. More specifically, p and pr are the unique, lowest-degree polynomials with the respective properties that d p interpolates {(b + xj ,fj)}j=1 and (3.1) N pr interpolates {(xj ,fj)}j=N−d+1. This definition differs slightly from that in section 2.1; as a result of the periodization procedure, the present p is a translation of the previous p by the length of the peri- odic interval b. Because the following analysis does not depend on a particular choice of step function, suppose that ψ is any C∞ function on [0, 1] with the properties that (3.2) max |ψ(x)| < , max |1 − ψ(x)| < , − <ψ(x) < 1+ x∈[0,α] x∈[1−β,1] for some small >0, and define the associated approximate step function as in (2.4). By rescaling this step function to the interval [1,b], one can define two additional approximate step functions x − 1 b − x (3.3) s = S ,sr = S . b − 1 b − 1

The functions s and sr are, respectively, an approximate smooth step-up and step- down on the interval [1,b]. The behavior of the discrete periodic extension operator Eper can now be understood through the function f(x)ifx ∈ [0, 1], (3.4) f per(x)= s(x)p(x)+sr(x)pr(x)ifx ∈ [1,b].

per Indeed, the values of ˜fper in (2.8) are exactly the sample values of f as defined above at the points xi =(i − 1)h for i =1, 2,...,N + M. The question of approximation accuracy can now be stated as follows. Let If per = b per per IN+M f be trigonometric polynomial interpolant of f defined in (1.1). To what degree does the function If per approximate f(x) in the interval [0, 1]? In other words, per per we seek a bound on f−If L∞[0,1]. Because f is, in general, not even continuous, the standard estimates [15, Chapter 4] do not immediately prove high-order accuracy. However, as in [17, section 5.2], it is possible to circumvent this issue up to a very small error by smoothing. In other words, f per is very close to a Cd function gper (defined below) that, in turn, provides a high-order accurate approximation of f in the interval [0, 1]. Keeping in mind (3.2), we observe that there exists a C∞ function ψ on [0, 1] satisfying (2.3) along with (3.5) ψ − ψL∞ ≤ 2 and − <ψ(x) < 1+ . Such a function can be constructed as follows. First, by continuity, there exist numbers α and β satisfying α<α < 1 − β < 1 − β such that max |ψ(x)|≤2 and max |1 − ψ(x)|≤2 . x∈[0,α] x∈[1−β,1] Let W be a C∞ window function 0 ≤ W ≤ 1withsupportin(α, 1 − β) such that W ≡ 1on[α , 1 − β ] and let H(x) be the Heaviside function taking the value 0 for x<0and1forx>0. The function 1 ψ(x)=W (x)ψ(x)+(1− W (x))H x − 2 DISCRETE PERIODIC EXTENSION A681

has the desired properties. Indeed, on a neighborhood of [0,α] ∪ [1 − β,1], ψ agrees with H(x − 1/2), so ψ satisfies (2.3). Moreover, (3.6) ψ−ψL∞ =max max |(1 − W (x))ψ(x)|, max |(1 − W (x))(ψ(x) − 1)| ≤ 2 , x∈[0,α] x∈[1−β,1] and the smooth blending with the Heaviside function cannot cause ψ to take values outside the original range of ψ, contained in the interval (− , 1+ ). Define S, s,and per per sr from ψ as in (2.4) and (3.3) and the auxiliary functions h and g as follows: f(x)ifx ∈ [0, 1], (3.7) hper(x)= s(x)p(x)+sr(x)pr(x)ifx ∈ [1,b], f(x)ifx ∈ [0, 1], (3.8) gper(x)= s(x)Ef(x − b)+sr(x)Ef(x)ifx ∈ [1,b].

The approximation error can be bounded as follows, using the notational convention b I = IN+M :

(3.9) per f −If L∞[0,1] per per per per = g −If L∞[0,1] ≤g −If L∞[0,b] per per per per per per ≤g −Ig L∞[0,b] + Ig −Ih L∞[0,b] + Ih −If L∞[0,b] per per per per per per = g −Ig L∞[0,b] + I (g − h ) L∞[0,b] + I (h − f ) L∞[0,b]. Each of the three terms on the final line will be treated in turn, beginning with the last. In what follows, it is useful to assume that N is sufficiently large, e.g., that

(3.10) N ≥ max{100,M},

which allows the use of explicit constants in the estimates to follow. The estimates rely on the following two propositions describing the behavior of Ef, p,andpr in the extension region [1,b]. Proposition 3.1. Let f be a Cp function with p>d,andletEf be as defined in (1.2).LetC be a number such that max f (j)(0) , f (j)(1) ≤ Cj ,j=1, 2,...,d.

Then for all x ∈ [1,b]=[1, 1+(M +1)h],

|Ef(x)|≤|f(1)| + eC(M+1)h − 1 and |Ef(x − b)|≤|f(0)| + eC(M+1)h − 1.

Proof. By definition of Ef, for any x, d j d j (j) (x − 1) (C|x − 1|) C|x−1| |Ef(x) − f(1)|≤ f (1) ≤ ≤ e − 1, j=1 j! j=1 j!

and similarly for the difference between Ef(x)andf(0). The proposition follows from the triangle inequality and the bounds on x. Proposition 3.2. Let f and Ef satisfy the hypotheses of Proposition 3.1,and let p and pr be as defined in (3.1). Then, for any x ∈ [1,b]=[1, 1+(M +1)h],there A682 NATHAN ALBIN AND SUREKA PATHMANATHAN

exist numbers z ∈ [0, (d − 1)h] and zr ∈ [1 − (d − 1)h, 1] such that M + d d (d) |pr(x) −Ef(x)|≤ h f (zr) and d M + d d (d) |p(x) −Ef(x − b)|≤ h f (z) . d

d+1 Asymptotically, to an error of order O(h ), z can be taken to be 0 and zr can be taken to be 1. Proof. The standard error bound on polynomial interpolation implies that for all x ∈ [1,b], there exists a zr ∈ [1 − (d − 1)h, b] such that N 1 (d) Ef(x) − pr(x)= (x − xj )Ef (zr). d! j=N−d+1

(d) Since Ef is a dth degree polynomial in [1,b], Ef is constant there, and thus zr can be taken to lie in the interval [1 − (d − 1)h, 1]. The polynomial portion of this error is largest when x = b =(N + M)h, providing the first estimate of the proposition. The second estimate is similar. The asymptotic estimate follows from the fact that (d) zr =1+O(h), z =0+O(h) and the assumed continuity of f . 3.1. Error due to small discontinuities. The final term of (3.9) is the supre- mum norm of the trigonometric polynomial interpolant of the difference hper − f per which, by construction, is identically zero in [0, 1]. Thus, using (3.6), we have

per per per per h − f L∞[0,b] = h − f L∞[1,b]

≤(s − s)pL∞[1,b] + (sr − sr)prL∞[1,b] ≤ 2 pL∞[1,b] + prL∞[1,b] .

Propositions 3.1 and 3.2 imply that (3.11) C(M+1)h M + d d (d) max{pL∞[1,b], prL∞[1,b]}≤fL∞[0,1] +e −1+ h f L∞[0,1], d and so (3.12) per per C(M+1)h M + d d (d) h − f L∞[0,b] ≤ 4 fL∞[0,1] + e − 1+ h f L∞[0,1] , d where C is the constant defined in Proposition 3.1. It is trivial by linear interpolation to construct a continuous, periodic function R(x) which agrees with hper−f per at each per per of the M + N interpolation nodes and is uniformly bounded by h − f L∞[0,b]. To R(x), we apply the following theorem (Lemma I on p. 120 of [15]). Theorem 3.3. Let f(x) be a continuous, b-periodic function and let If = b IN+M f be the trigonometric polynomial interpolant (as defined in (1.1))off at N+M equispaced nodes with N and M satisfying (3.10).Then

IfL∞ ≤ 2.03 log(N)fL∞ .

Proof. Since the L∞ norm is under change of variables, the period of f(x) is irrelevant. The proof (without explicit constant) is given in the reference. DISCRETE PERIODIC EXTENSION A683

There, the proof is given using the and cosine series interpolant using either 2n or 2n + 1 nodes. In the present context n = (N + M)/2,so N + M N + M − 1 ≤ n ≤ . 2 2 The constant 2.03 arises (see p. 120 of [15] near the bottom) from the fact that for N and M satisfying (3.10), 4 ≤ 4 4+logn = 1+ log n 1+ − log n log n log((N + M)/2 1) 4 ≤ 1+ log n<2.03 log n log(49) and N + M log n ≤ log ≤ log N. 2 Applying Theorem 3.3 to R(x) then implies that per per I (h − f ) L∞[0,b]

= IRL∞[0,b] (3.13) C(M+1)h M + d d (d) ≤ 8.12 log(N) fL∞[0,1] + e − 1+ h f L∞[0,1] . d 3.2. Polynomial approximation error. The middle error term at the end of (3.9) is the error produced by approximating Ef(x) by the polynomial interpolants of f. Recall that Ef(x), defined in (1.2), is a Cd extension of f to all of R by means of dth degree Taylor polynomial approximation, and that p and pr are the (d−1)-degree polynomial interpolants of f defined in (3.1). Since hper and gper are both continuous, Theorem 3.3 again applies, yielding (3.14) per per I (g − h ) L∞[0,b] per per ≤ 2.03 log(N)g − h L∞[0,b]

=2.03 log(N)s(x){Ef(x − b) − p(x)} + sr(x){Ef(x) − pr(x)}L∞[1,b] ≤ 2.03(1 + ) log(N) Ef(x − b) − p(x)L∞[1,b] + Ef(x) − pr(x)L∞[1,b]

for all N satisfying (3.10). Here we have used the bound |s(x)|, |sr(x)|≤1+ implied by (3.5). Thus, this term of the approximation reduces to an estimate of the degree to which the Taylor polynomials and interpolating polynomials of f can differ in the interval [1,b]. From (3.14) and Proposition 3.2, it follows that per per M + d d (d) (3.15) I (g − h ) L∞[0,b] ≤ 4.06(1 + ) log(N) h f L∞[0,1] . d 3.3. Interpolation error. The remaining error term—the first term in the fi- nal line of (3.9)—is the error produced in approximating gper by its trigonometric per b per polynomial interpolant Ig = IN+M g . The natural estimate would be to invoke the following theorem (Corollary III on p. 123 of [15]). p b Theorem 3.4. Let f(x) be a C b-periodic function and let If = IN+M f be the trigonometric polynomial interpolant (as defined in (1.1))off at N + M equispaced A684 NATHAN ALBIN AND SUREKA PATHMANATHAN

nodes with N and M satisfying (3.10).Then −p (p) f −IfL∞ ≤ CpN log Nf L∞

for a constant Cp depending only on p. Applying this to gper gives per per −d per (d) (3.16) g −Ig L∞[0,b] ≤ CdN log(N)(g ) L∞[0,b]

for some constant Cd. The difficulty in this approach can be seen by examination of per (3.8). In the interval [1,b], g is defined through the step function s and sr,which areinturndefinedthroughafixedstepfunctionS as in (3.3). As h → 0, the dth derivative of gper will tend to grow proportionally to (b − 1)−d =((M +1)h)−d = O(N d), canceling the rapid decay of the N −d term in (3.16). This is, of course, due to the fact that M has been restricted to be a fixed number independent of N. If, for example, M had been chosen instead to satisfy M ≥ cN for some constant c, then the dth derivative of gper would instead grow proportionally to N d−1, which would force the interpolation error to decay (albeit slowly) with increasing N. The key observation here is that gper is constructed from a fixed function Ef and per a windowing function defined through s and sr. It is evident that in order for g to be well-approximated by its trigonometric polynomial interpolant, both Ef and the window function need to be well-resolved in some sense. Recalling that the step function producing the window was constructed explicitly to be well-resolved by M points, we might expect (3.16) to hold with the derivative of gper replaced by that of Ef, which would then provide order N −d log(N) convergence. To obtain a sense of how well-resolved the windowing function of section 2.2 25 is, we proceed as follows. Let ψ ∈ R be the vector of discrete samples given in Appendix A, and let ψr be the analogous step-down samples (i.e., ψr is just ψ in reversed order). Let Z and N be positive and define 0Z to be the vector made up of Z zeros, and 1N to be the vector made up of N ones. Then the vector ⎛ ⎞ 0Z ⎜ ⎟ ⎜ψ ⎟ ⎜ ⎟ w = ⎜1N ⎟ ⎝ ⎠ ψr 0Z represents the discrete samples of a windowing function W . Treating w as samples of a periodic function, letw ˆ be the vector of discrete Fourier coefficients of w and let wˆtail be the result of zeroing out the 80% lowest frequency coefficients ofw ˆ (i.e.,w ˆtail contains only the “tail” made up of the modes from 80% of the highest frequency and higher). A quantification of resolution, then, can be obtained by checking that the tail contains almost none of the energy content of the entire window. That is, the quantity 2 wˆtail (3.17) R = wˆ2 should be small. Figure 4 displays the values of R for the choice Z = 10, using 1,000 randomly selected values of N between 10 and 10,000. Evidently, the discrete windowing function is well-resolved for all values of N in this range. DISCRETE PERIODIC EXTENSION A685

1e-18

1e-19

1e-20 R 1e-21

1e-22

1e-23 10 100 1000 10000 N

Fig. 4. The resolution indicator R defined in (3.17) as a function of N.ValuesofR are plotted against N on a log-log axis for 1,000 randomly selected values of N between 10 and 10,000.

3.4. A heuristic bound. We now argue for a heuristic bound. Suppose is small (e.g., <10−15) and let C be defined as in Proposition 3.1. Then the bounds in (3.13) and (3.15) combined are bounded from above by M + d d (d) C(M+1)h 4.07 h log Nf L∞[0,1] +8.12 log N fL∞[0,1] + e − 1 , d the sum of an algebraically decaying term and a small term that eventually grows logarithmically. To give a sense of the size of the constants, suppose that M = 25, d = 10, =10−15, C = 100 (see Example 3 of section 5), and N = 100. Then the bound on (3.13) and (3.15) becomes

−11 (d) −14 −3 3.80 × 10 f L∞[0,1] +3.74 × 10 fL∞[0,1] +9.52 × 10 .

With N =10,000, the bound is

−31 (d) −14 −14 6.89 × 10 f L∞[0,1] +7.48 × 10 fL∞[0,1] +2.22 × 10 .

Based on the heuristic argument for the rapid decay of the interpolation term, it might be expected that with =10−15, the approximation error in the discrete periodic extension satisfies asymptotically (3.18) per −d −14 (heuristic) f−If L∞[0,1] ≤ O N log(N) +10 log N fL∞[0,1] + O(h) .

4. Computational considerations. The discrete periodic extension described in the previous sections relies on several numerical calculations that should be treated with some care. Potential concerns include the following: • The construction of the orthogonal polynomials used in section 2.1 requires the QR decomposition of a Vandermonde matrix (or some equivalent calcu- lation). • The polynomial interpolation used in section 2.1 is performed on uniformly spaced points, which immediately brings to mind Runge’s famous example. Moreover, these polynomials are then used for extrapolation in the extension interval. • The matrix A in (2.10) is ill-conditioned. As shown in what follows, each of these concerns can be addressed adequately by the use of a one-time “offline” precomputation phase (similar to that of the FC- A686 NATHAN ALBIN AND SUREKA PATHMANATHAN

Gram construction) which uses high-precision arithmetic to circumvent conditioning problems. In the offline phase, an extension database consisting of the matrices E˜, E˜r, Q,andQr is computed carefully using variable precision arithmetic and is stored (in double precision) for future use. This extension database can then be utilized any time an extension is needed. In the online phase, extensions are performed rapidly via (2.8) using the database. For the examples presented in this paper, the algorithms described in the present section were implemented using the variable-precision arithmetic (vpa) capabilities provided by the Symbolic Math Toolbox of MATLAB. Although no effort was made to optimize the parameter, the value digits=200—instructing MATLAB to use 200 digits of precision in its calculation—is sufficient for the calculations described in the present paper. In what follows, we describe the computation of Q and E˜. Qr and E˜r are computed analogously. For notational convenience, the subscript  is suppressed. 4.1. Orthogonal polynomials and extrapolation. In order to avoid errors arising from the QR decomposition of the Vandermonde matrix, the matrices Q and E are computed using vpa and the standard recursion

qn+1(x)=an+1((x + bn+1)qn(x)+cn+1qn−1(x)),n=2, 3,....

The recursion begins with the first two polynomials, q0(x) ≡ 1/ d and q1(x)=a(x + b) with b chosen so that q0,q1 =0anda chosen so that q1 = 1. Subsequent polynomials are computed according to the recursion, with bn+1 and cn+1 chosen to ensure orthogonality with qn and qn−1 and an+1 chosen to normalize qn+1.Aseach polynomial is determined, its values are used to fill in the appropriate columns of Q and E according to (2.1) and (2.2). For the examples presented in this paper, the entries of Q were computed using the spatial grid {0, 1, 2,...,d− 1},andtheentries of E were evaluated on the grid {−M,−M +1,...,−1}.OnceQ and E are computed in very high precision, they are converted to double precision. 4.2. Polynomial interpolation and extrapolation. In addressing the con- cern of polynomial interpolation on equispaced points, it is important to keep in mind that the number of points d is fixed (to d = 10 in the examples of this paper), while h decreases to zero. Thus, the interpolation error at any point decreases rapidly to zero as h decreases. However, because the polynomials are extrapolated—that is, evaluated outside of the interpolation region—in the construction of E, a natural concern is that the vector max norm ˜fper∞ of ˜fper defined in (2.8) is much larger than f∞. In the context of per section 3, this means that f L∞[1,b] is much larger than fL∞[0,1]. Numerically, this would be problematic, since it would imply that the FFT coefficients of ˜fper are very large and, thus, that some significant cancellation occurs in the evaluation of the trigonometric polynomial interpolant in the interval [0, 1] where f is relatively small. Naturally, this would suggest a numerical loss of accuracy. In order to address this concern, observe that (3.11) shows that the magnitude of f per cannot grow too much away from the magnitude of f. In the case, for example, of f(x)=e−100x using M =25andd = 10 (see section 5 for the numerical results), per C can be taken to be 100. For h =1/500, we have that f L∞[1,b] can be no larger than approximately 400, and we should not expect to lose more than a few digits DISCRETE PERIODIC EXTENSION A687

2 1 1 1.5 0.9 0.8 0.8 1 0.7 0.6 0.5 0.6 0 0.5 0.4 -0.5 0.4 0.3 -1 0.2 0.2 -1.5 0.1 0 -2 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 (a) (b) (c)

Fig. 5. Plots of the functions used in the numerical experiments presented in section 5. to cancellation errors in the use of the FFT. In fact, this approximation tends to be rather pessimistic. In the case just described, for example, the actual L∞ norm of the extension is less than 8. 4.3. Solving for the approximate step. The linear system in (2.10), with coefficients given in section 2.2.1, also tends to be poorly conditioned. The procedure forsolvingthelinearsystemisasfollows. The matrix of coefficients A and right- hand-side vector b are initialized as a vpa matrix and vector with the coefficients given previously. The linear system is then solved (via the “backslash operator”) for a, which is converted to the double data type. The samples of the smooth step are evaluated in double precision via (2.5). The values for a as well as the sample values of ψ for the approximate step used throughout the present paper are reported in Appendix A. Once the values of ψ are known, these values are multiplied elementwise by the columns of E, as in (2.7), to produce E˜. 5. Numerical experiments. As stated in section 1.7, the aim of the present paper is to present a simplified discrete periodic extension to act as a replacement for the more complex FC-Gram algorithm. The present section presents evidence that this goal has been accomplished: although the FC-Gram approach and the algorithm presented in this paper do not produce identical discrete periodic extensions, they do exhibit almost identical behavior in terms of representation of the original function in its original interval. This is demonstrated in three examples. In each example, a function f(x) (shown in Figure 5) on the interval [0, 1] is selected, and the methods are compared in terms of representation error as follows. Given N uniformly spaced samples of f in the interval, we have described two methods for discrete periodic extensions: the original FC-Gram method and the new periodic extension using the approximate window constructed in section 2.2.1. In both cases, d = dr = 10 near-boundary samples are used to produce the extension of M = 25 extension points. Once the extensions are computed, the resulting trigonometric polynomial inter- polants are evaluated on a finer grid with 20 times as many nodes by means of FFTs and zero padding. The representation error is estimated by approximating the L∞ error and L2 error in the interpolant using all nodes on the refined grid lying in the original interval. Convergence rates of the form aN −b log(N)ora exp(−bN)arees- timated by a least squares fit on logarithmic axes of the parameters a and b to the error data for a range of N. A688 NATHAN ALBIN AND SUREKA PATHMANATHAN

orig. FC-Gram orig. FC-Gram 1e-1 new method 1e-1 new method

1e-4 1e-4

1e-7 1e-7 error 2 L max error 1e-10 1e-10

1e-13 1e-13

0 500 1000 1500 2000 0 500 1000 1500 2000 # samples # samples (a) (b)

Fig. 6. The approximate representation error in max norm (a) and L2 norm (b) observed when applying discrete periodic extensions to the function f(x)=sin(60x) exp(cos(40x)).Thepoints labeled “orig. FC-Gram” indicate the results of applying the original FC-Gram algorithm with M =25extension points. The points labeled “new method” indicate the results of applying the new periodic extension developed in the present paper using the approximate windowing function described in section 2.2.1 and M =25extension points. In both cases, d = dr =10near-boundary samples were used to compute the extensions. The rates of convergence are approximated by the solid curves in (a) and (b), defined by the functions exp(47.95)N −10.8 log(N) and exp(47.76)N −11.1 log(N),fit to the data in the ranges [600, 1500] and [500, 1500], respectively.

Regarding estimates of the L2 error, (3.18) clearly implies that the same estimate ∞ 2 · 2 ≤ √should hold when the L norm is replaced by the L norm, because L [a,b] b − a·L∞[a,b]. In fact, one might hope to do slightly better since, for example, the max norms at the end of (3.14) are taken over the O(h)-length interval [1,b], thus gaining a multiple of O(N −1/2) in the corresponding L2 estimate. The numerical examples that follow do indeed indicate that the convergence is slightly better in the L2 norm than in the L∞. It is important to note that, although the methods demonstrate similar accuracy, the offline computational time required to precompute the extension databases differ greatly, as discussed in section 1.7. Computing the extension database for 10th- order FC-Gram method (using the MATLAB code developed for use in [3]) requires approximately 200s on a 1.6GHz Intel Xeon CPU. Computing the 10th-order extension database by the new approach requires only 2.5s. Example 1: General nonperiodic function. Figure 5(a) displays the graph of the function f(x)=sin(60x)ecos(40x) , which is certainly not periodic on the interval [0, 1]. Figure 6 displays the approxima- tions of max norm and L2 norm errors for a range of values of N. The error appears to decay a bit faster than the O(N −10 log(N)) expected from the heuristic bound in (3.18) down to a small value of approximately 10−13. Example 2: Challenging in the interior. Figure 5(b) displays the graph of the function 1 f(x)= . 1+104(2x − 1)2 This function has very small derivatives near the boundary but is quite sharp in the interior of the domain. In this case, we expect the errors in (3.13) and (3.15) to contribute very little to the total error, which should instead be dominated by the error in representing a “nearly C∞” function by its trigonometric polynomial DISCRETE PERIODIC EXTENSION A689

1e-1 orig. FC-Gram 1e-1 orig. FC-Gram new method new method 1e-4 1e-4

1e-7 1e-7 error 2 L

max error 1e-10 1e-10

1e-13 1e-13

1e-16 1e-16 0 500 1000 1500 2000 0 500 1000 1500 2000 # samples # samples (a) (b)

Fig. 7. The corresponding plots to those of Figure 6 for the function f(x)=(1+104(2x−1)2)−1. The rates of convergence are approximated by the solid curves in (a) and (b), defined by the functions 2.06 exp(−N/20.2) and 1.06 exp(−N/20.1), respectively. In both cases, the error curve was fit in the range [100, 500].

orig. FC-Gram orig. FC-Gram 1e-1 new method 1e-1 new method

1e-4 1e-4

1e-7 1e-7 error 2 L max error 1e-10 1e-10

1e-13 1e-13

0 500 1000 1500 2000 0 500 1000 1500 2000 # samples # samples (a) (b)

Fig. 8. The corresponding plots to those of Figure 6 for the function f(x)=exp(−100x).The rates of convergence are approximated by the solid curves in (a) and (b), defined by the functions exp(36.57)N −9.61 log(N) and exp(35.98)N −10.0 log(N), fit to the data in the ranges [500, 1200] and [400, 1000], respectively.

interpolant. Figure 7 displays the representation errors for various N.Inthiscase,the convergence appears to be geometric (see [5, Definition 5]). This is quite remarkable since the function differs from Runge’s example only by an affine change of variables; the polynomial interpolants of f(x) using uniformly spaced nodes fails to converge uniformly. Example 3: Challenging near the boundary. Figure 5(c) displays the graph of the function f(x)=exp(−100x). This function, which is representative of what one might expect in the boundary layer of a fluid flow simulation, exhibits rapid decay near one boundary and is thus most challenging to resolve in the region where polynomial approximations are used. Fig- ure 8 displays the approximations of max norm and L2 norm errors for a range of val- ues of N. In this case, the error appears to decay a bit slower than the O(N −10 log(N)) expected from the heuristic bound in (3.18) down to approximately 10−13. To give a sense of the resolution power of the method, Figure 9(a) displays the locations of the sample points when N = 83, which provides a representation error of approximately 0.1%. Nine sample points are needed in the interval [0, 0.1]. Figure 9(b) A690 NATHAN ALBIN AND SUREKA PATHMANATHAN

[Figure 9 plots: (a) f(x) and the sample locations for x in [0, 0.1]; (b) the approximation error for x in [0, 1].]

Fig. 9. The sample point locations (a) and approximation error (b) for the function f(x) = exp(−100x) with N = 83 uniformly spaced sample points in the interval [0, 1].

6. Conclusion. This paper presents a new algorithm for discrete periodic extension, a key ingredient in Fourier-based PDE solvers for nonperiodic problems. The new algorithm, based on the three-step process of smooth extension, windowing, and periodization, provides a viable replacement for the existing FC-Gram methodology. Three numerical examples were presented, comparing 10th-order implementations of both the original FC-Gram algorithm and the new algorithm.
In all three examples, the new algorithm performed nearly identically to the FC-Gram algorithm, while requiring only a small fraction of the setup time; a decrease of 98.75% in setup time was reported in section 5. Thus, the new extension method presented in this paper renders feasible the use of large "parameter sweeps" to explore the impact of various parameter choices on the stability, accuracy, etc., of FC-based PDE solvers. Moreover, the new construction, by avoiding precomputations on an auxiliary fine grid, is somewhat simpler to implement and to analyze. We believe that this faster and simpler approach to periodic extension will have a significant impact on the future analysis, development, and testing of FC-based solvers.
Appendix A. Coefficients of the approximate step function. The approximate step function ψ used in most of the preceding was defined through the minimization procedure of section 2.2.1 with parameters α = β = 0.15 and W = 29. The minimizing choice of coefficient vector a produces a step function ψ (see (2.11)) with maximum error less than 10^{−15} on [0, α] ∪ [1 − β, 1]. The coefficients are as follows:

a0  = +5.000000000000000e−01,    a1  = −6.272559465331051e−01,
a3  = +1.856717861764287e−01,    a5  = −8.774650853929397e−02,
a7  = +4.368094156065561e−02,    a9  = −2.087086267296356e−02,
a11 = +9.196566765000086e−03,    a13 = −3.646913322458731e−03,
a15 = +1.276129637718593e−03,    a17 = −3.864141784553076e−04,
a19 = +9.899552107100009e−05,    a21 = −2.084432853147060e−05,
a23 = +3.462338134287341e−06,    a25 = −4.254463565378016e−07,
a27 = +3.439531331391775e−08,    a29 = −1.373156867409182e−09.
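Equation (2.11) is not reproduced in this appendix. Purely for illustration, the sketch below assumes that ψ has the odd-cosine form ψ(x) = a0 + Σ_{k odd} a_k cos(kπx), which the listed indices suggest (a0 = 1/2 and only odd k up to W = 29); that form is an assumption of this sketch, not a statement of (2.11), and any use of it should be checked against the tabulated values of ψ(α + jη) listed next. The one property the sketch does verify, the complementary symmetry ψ(x) + ψ(1 − x) = 1, holds for any coefficient values of this assumed form.

import numpy as np

# Coefficients from Appendix A: a_0 = 1/2, followed by a_1, a_3, ..., a_29.
a0 = 0.5
a_odd = np.array([
    -6.272559465331051e-01, +1.856717861764287e-01, -8.774650853929397e-02,
    +4.368094156065561e-02, -2.087086267296356e-02, +9.196566765000086e-03,
    -3.646913322458731e-03, +1.276129637718593e-03, -3.864141784553076e-04,
    +9.899552107100009e-05, -2.084432853147060e-05, +3.462338134287341e-06,
    -4.254463565378016e-07, +3.439531331391775e-08, -1.373156867409182e-09,
])
k_odd = np.arange(1, 30, 2)   # odd wavenumbers 1, 3, ..., 29 (W = 29)

def psi(x):
    # ASSUMED form: psi(x) = a0 + sum over odd k of a_k * cos(k*pi*x).
    # Whether this matches (2.11) exactly must be verified against the paper.
    x = np.asarray(x, dtype=float)
    return a0 + np.cos(np.pi * np.outer(x, k_odd)) @ a_odd

# Symmetry check, valid for any coefficients of this form since a0 = 1/2
# and cos(k*pi*(1 - x)) = -cos(k*pi*x) for odd k.
x = np.linspace(0.0, 1.0, 11)
assert np.allclose(psi(x) + psi(1.0 - x), 1.0)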

The values of ψ at the M = 25 sample points in (α, 1 − β) as defined in (2.5) are as follows:

ψ(α + jη) = ( 5.102700344272514e−13, 3.007868857205240e−10, 2.855934704428188e−08,
              9.903150431372917e−07, 1.739236310356656e−05, 1.835864342108874e−04,
              1.291276842049902e−03, 6.477932959771333e−03, 2.431786859079330e−02,
              7.081031095830816e−02, 1.646106010695066e−01, 3.132651432031053e−01,
              4.999999999999999e−01, 6.867348567968952e−01, 8.353893989304931e−01,
              9.291896890416921e−01, 9.756821314092068e−01, 9.935220670402289e−01,
              9.987087231579502e−01, 9.998164135657892e−01, 9.999826076368966e−01,
              9.999990096849569e−01, 9.999999714406530e−01, 9.999999996992134e−01,
              9.999999999994899e−01 )^T.

REFERENCES

[1] B. Adcock and D. Huybrechs, On the resolution power of Fourier extensions for oscillatory functions, J. Comput. Phys., 260 (2014), pp. 312–336.
[2] B. Adcock, D. Huybrechs, and J. Martín-Vaquero, On the numerical stability of Fourier extensions, Found. Comput. Math., to appear.
[3] N. Albin and O. P. Bruno, A spectral FC solver for the compressible Navier–Stokes equations in general domains I: Explicit time-stepping, J. Comput. Phys., 230 (2011), pp. 6248–6270.
[4] N. Albin, O. P. Bruno, T. Y. Cheung, and R. O. Cleveland, Fourier continuation methods for high-fidelity simulation of nonlinear acoustic beams, J. Acoust. Soc. Am., 132 (2012), pp. 2371–2387.
[5] J. P. Boyd, Chebyshev and Fourier Spectral Methods, Dover, New York, 2000.
[6] J. P. Boyd, A comparison of numerical algorithms for Fourier extension of the first, second, and third kinds, J. Comput. Phys., 178 (2002), pp. 118–160.
[7] J. P. Boyd, Asymptotic Fourier coefficients for a C∞ bell (smoothed-"top-hat") & the Fourier extension problem, J. Sci. Comput., 29 (2005), pp. 1–24.
[8] O. Bruno and A. Prieto, Spatially dispersionless, unconditionally stable FC-AD solvers for variable-coefficient PDEs, J. Sci. Comput., 58 (2014), pp. 331–366.
[9] O. P. Bruno, Fast, high-order, high-frequency integral methods for computational acoustics and electromagnetics, in Topics in Computational Wave Propagation, Lecture Notes Comput. Sci. Eng. 31, Springer, Berlin, 2003, pp. 43–82.
[10] O. P. Bruno, Y. Han, and M. M. Pohlman, Accurate, high-order representation of complex three-dimensional surfaces via Fourier continuation analysis, J. Comput. Phys., 227 (2007), pp. 1094–1125.
[11] O. P. Bruno and M. Lyon, High-order unconditionally stable FC-AD solvers for general smooth domains I. Basic elements, J. Comput. Phys., 229 (2010), pp. 2009–2033.
[12] S. Chandrasekaran, K. Jayaraman, and H. Mhaskar, Minimum Sobolev norm interpolation with trigonometric polynomials on the torus, J. Comput. Phys., 249 (2013), pp. 96–112.
[13] D. Huybrechs, On the Fourier extension of nonperiodic functions, SIAM J. Numer. Anal., 47 (2010), pp. 4326–4355.
[14] M. Israeli, L. Vozovoi, and A. Averbuch, Spectral multidomain technique with local Fourier basis, J. Sci. Comput., 8 (1993), pp. 135–149.
[15] D. Jackson, The Theory of Approximation, Vol. 11, AMS, Providence, RI, 1930.
[16] J. Knab, Interpolation of band-limited functions using the approximate prolate series (corresp.), IEEE Trans. Inform. Theory, 25 (1979), pp. 717–720.
[17] M. Lyon, High-Order Unconditionally-Stable FC-AD PDE Solvers for General Domains, Ph.D. thesis, California Institute of Technology, Pasadena, CA, 2009.
[18] M. Lyon, A fast algorithm for Fourier continuation, SIAM J. Sci. Comput., 33 (2011), pp. 3241–3260.
[19] M. Lyon, Approximation error in regularized SVD-based Fourier continuations, Appl. Numer. Math., 62 (2012), pp. 1790–1803.

[20] M. Lyon and O. P. Bruno, High-order unconditionally stable FC-AD solvers for general smooth domains II. Elliptic, parabolic and hyperbolic PDEs; theoretical considerations, J. Comput. Phys., 229 (2010), pp. 3358–3381.
[21] G. Matviyenko, Optimized local trigonometric bases, Appl. Comput. Harmon. Anal., 3 (1996), pp. 301–323.
[22] K. Shahbazi, N. Albin, O. P. Bruno, and J. S. Hesthaven, Multi-domain Fourier-continuation/WENO hybrid solver for conservation laws, J. Comput. Phys., 230 (2011), pp. 8779–8796.
[23] K. Shahbazi, J. S. Hesthaven, and X. Zhu, Multi-dimensional hybrid Fourier continuation-WENO solvers for conservation laws, J. Comput. Phys., 253 (2013), pp. 209–225.