ABSTRACT

NORMAN, MATTHEW ROSS. Investigation of Higher-Order Accuracy for a Conservative Semi-Lagrangian Discretization of the Atmospheric Dynamical Equations. (Under the direction of Dr. Fredrick H. M. Semazzi.)

This study considers higher-order spatial and temporal methods for a conservative semi-implicit semi-Lagrangian (SISL) discretization of the atmospheric dynamical equations. With regard to spatial accuracy, new subgrid approximations are tested in the Conservative Cascade Scheme (CCS) SL transport algorithm. When developed, the CCS used the monotonic Piecewise Parabolic Method (PPM) to reconstruct cell variation. This study adapts four new non-polynomial methods to the CCS context for comparison against PPM: the Piecewise Hyperbolic Method (PHM), the Piecewise Double Hyperbolic Method (PDHM), the Piecewise Double Logarithmic Method (PDLM), and the Piecewise Rational Method (PRM). Additionally, an adaptive hybrid approximation scheme, PPM-Hybrid (PPM-H), is constructed using monotonic PPM for smooth data and local extrema and PHM for steep jumps, where PPM typically suffers large accuracy degradation.

Smooth and non-smooth data profiles are transported in 1-D, 2-D Cartesian, and 2-D spherical frameworks under uniform advection, solid body rotation, and deformational flow. Accuracy is compared in the L1 error measure. PHM performed up to five times better than PPM for smooth functions but up to two times worse for non-smooth functions. PRM performed very similarly to PPM for non-smooth functions, but its order of convergence was worse than PPM's for smooth data. PDHM performed the worst of all the non-polynomial methods for almost every test case. PPM-H outperformed both PPM and all of the new methods for all test cases in all geometries, offering a robust advantage in the CCS scheme. Additionally, the CCS and new subgrid approximations were used to perform conservative grid-to-grid interpolation between two spherical grids in latitude/longitude coordinates. The methods were tested by prescribing an analytical sine wave function which was integrated over grid cells at T-42 resolution (approximately 2.8° × 2.8°) and at 1° resolution. The 1° data were then interpolated to the T-42 grid for comparison against the analytical formulation. Three test data sets were created with increasing sharpness in the sine wave profiles by spanning 1, 3, and 9 wavelengths across the domain. In all test cases, PDHM performed the best in the interpolation scheme, better than PPM.

Regarding temporal accuracy, a linear SISL 2-D dynamical model is given harmonic input for the dependent variables to extract a von Neumann analysis of the SISL numerical modification of the solution. The Boussinesq approximation is relaxed, and spatial error is removed in order to isolate temporal accuracy alone. A hydrostatic switch is employed to invoke and remove non-hydrostatic dynamics. Trajectory uncentering (typically used to suppress spurious orographic SISL resonance) is included by altering the coefficients of the forcing terms of the linear equations. It was found that, with regard to Internal Gravity Wave (IGW) motion, the first-, second-, and third-order Adams-Moulton (AM) schemes performed with increasingly greater accuracy. Also, the higher the order of temporal convergence, the greater the gain in accuracy from simulating in a non-hydrostatic context relative to a hydrostatic one. Second-order uncentering resolves IGW phases poorly, resulting in an RMSE nearly the same as the first-order scheme's. The third-order AM scheme demonstrated superior accuracy to the other methods in this part of the study. Further research may determine whether uncentering is necessary with this method for stability.


Dedication

I dedicate this foremost to God. Lord, You truly are my only deep satisfaction. I can’t believe how merciful You’ve been to me. I remember how cynical I used to be, considering everything so meaningless, but You’ve brought light to that darkness. You give me a worth apart from any of my silly accomplishments or severe failures, a worth given in Christ apart from which I would never be here today. You are altogether different than us, and You treat me so much better than I deserve! So thank You, Lord, and I hope this honors You.

I dedicate this next to my wife, Shannon, whom I love more than any other person. You have cared for me so well and stuck with me through all of the many anxiety attacks of academic “expectation.” And I’m so thankful that you were patient with me while I spent this past summer in Colorado. I would have had a nervous breakdown if it wasn’t for your encouragement and gentle kindness! I dedicate this to you, babe. You are a beautiful gift from God, and you reflect His radiance to me! “And if you’re waiting for love, it’s a promise I’ll keep if you don’t mind believing that it changes everything. Time will never matter.”

Last, I dedicate this to my family. You’ve all treated me well and raised me well, and I really want to honor you all here. Thanks for investing all the time you have in my life. I’m grateful, and though I’m sure I did not express that very well while being raised, I want to express it now.

Biography

Matthew Ross Norman was born in Burlington, NC in September of 1983, subsequently moving to Greenville, NC, where he attended grade school, middle school, and high school. He attended D. H. Conley High School and graduated in 2001 with an early interest in math and physics. Afterward, he attended North Carolina State University, graduating with honors in May of 2006 with a B.S. in meteorology, a B.S. in computer science, and a minor in mathematics. He then began graduate studies at North Carolina State University for an M.S. degree in atmospheric science.

Acknowledgments

I would like to acknowledge and express much gratitude to Dr. Fredrick Semazzi (my advisory committee chair) for being an extremely good academic adviser and for guiding me, through undergrad, into the topics which eventually led to the present thesis. Dr. Semazzi has helped explain a lot of difficult things about the semi-implicit, semi-Lagrangian scheme and how it is implemented.

I would like to thank Dr. Matthew Parker (also serving on my advisory committee) for teaching the class on mesoscale modeling, MEA 712. That class was truly an awesome introduction into the innards of numerical modeling. On a similar note, I would also like to extend gratitude to Dr. Robert Walko at Duke University for his help in understanding some of the dynamical core of OLAM (Ocean, Land, and Atmosphere Model), a project under Dr. Roni Avissar. I appreciate the help of Dr. Jeffrey Scroggs as well for some helpful guidance, and for serving on my advisory committee.

I would certainly like to extend many thanks to Drs. Ramachandran Nair and Rich Loft at the National Center for Atmospheric Research (NCAR) and the Institute for Mathematics Applied to Geosciences (IMAGe) for the research opportunity and funding support this past summer, which alone composes the first half of my thesis. Dr. Nair’s insight, explanations, and guidance were truly invaluable in helping me spin up on the topic of conservative semi-Lagrangian transport methods. Additionally, Dr. Peter Lauritzen gave me some helpful insight into some issues of the semi-implicit semi-Lagrangian discretization and the effects and need of trajectory uncentering.

Last, and by no means least, I would like to thank everyone in the Climate Modeling Laboratory for helping me with countless odds and ends and for helping me keep my sanity. It is a great place to work.

Table of Contents

List of Tables

List of Figures

Part I: New Subgrid Approximations for the Conservative Cascade Scheme

1 Introduction
  1.1 Review of Semi-Lagrangian Methods
  1.2 Review of Conservative SL Transport Methods
    1.2.1 Cascade Methods
  1.3 New Subgrid Approximation for the CCS

2 Methodology
  2.1 Conservative Cascade Scheme
    2.1.1 1-D Cell-Integrated SL Framework
    2.1.2 Cascade Dimensional Splitting
    2.1.3 Generating the Intermediate Grid
    2.1.4 Generating the Target Lagrangian Grid
    2.1.5 1-D Meridional Sweep
    2.1.6 1-D Zonal Sweep
    2.1.7 CCS in Spherical Coordinates
      2.1.7.1 Transforming the Coordinate System: (λ, θ) → (λ, µ)
      2.1.7.2 More Accurate Intersection Calculations
      2.1.7.3 Polar Cell Refinement and Polar Tangent Planes
      2.1.7.4 Local Tangent Planes for Zonal Boundary Calculations
      2.1.7.5 Treating the Polar Caps
    2.1.8 Positive Definite Filtering
  2.2 Sub-Grid Functional Approximations
    2.2.1 Piecewise Parabolic Method (PPM)
    2.2.2 Piecewise Hyperbolic Method (PHM)
    2.2.3 Piecewise Double Logarithmic Method (PDLM)
    2.2.4 Piecewise Double Hyperbolic Method (PDHM)
    2.2.5 Piecewise Rational Method (PRM)
  2.3 Constructing the Piecewise Parabolic Method - Hybrid (PPM-H)
    2.3.1 PHM Replacement at PPM Overshoots
    2.3.2 Replacement Methods for Extrema
      2.3.2.1 PDHM Replacement for Extrema
      2.3.2.2 PHM Replacement for Extrema
      2.3.2.3 Adaptive Use of PHM for New Extrema
    2.3.3 Computational Intercomparison
  2.4 Advection Test Cases
    2.4.1 1-D Test Cases
      2.4.1.1 Initial Data
    2.4.2 2-D Cartesian Test Cases
      2.4.2.1 Transport
      2.4.2.2 Initial Data
    2.4.3 2-D Spherical Test Cases
      2.4.3.1 Transport
      2.4.3.2 Initial Data
  2.5 The Error Norms
  2.6 Comparison with a Modern Scheme

3 Results
  3.1 Spatial Approximation Performances
    3.1.1 1-D
      3.1.1.1 Semi-Lagrangian CCS
      3.1.1.2 Eulerian WRF
    3.1.2 2-D Cartesian
    3.1.3 2-D Spherical

4 Conclusions and Future Work

5 Further Applications
  5.1 Application to Conservative Interpolation
    5.1.1 The Conservative Interpolation Procedure
    5.1.2 Sine Wave Test Case
    5.1.3 Results and Conclusions

Part II: Investigating Higher-Order Semi-Implicit Semi-Lagrangian Temporal Accuracy

6 Introduction
  6.1 Semi-Implicit Semi-Lagrangian (SISL) Methods
  6.2 Examining SISL Performance for Gravity Waves

7 Methodology
  7.1 Model Equations
  7.2 The Semi-Implicit Semi-Lagrangian Discretizations
    7.2.1 Two-Time Step Methods
  7.3 Removal of Spatial Error
  7.4 Relaxation of the Boussinesq Approximation
  7.5 Analytical Solutions
  7.6 Extracting Intrinsic Amplification from the Numerical Solutions
  7.7 Obtaining Total Temporal Error Measures
  7.8 Uncentering Methods

8 Results
  8.1 AM-1 Results: Implicit Euler Method
  8.2 AM-2 Results: Trapezoidal Method
  8.3 AM-3 Results
  8.4 Uncentering Results
    8.4.1 First-Order Results
    8.4.2 Second-Order Results
    8.4.3 Intercomparison

9 Conclusions and Future Work

Bibliography

List of Tables

Table 2.1 Conversions to and from polar tangent coordinates for the North and South Poles.

Table 2.2 Operation counts for the construction of the approximation functions. "Bound" refers to the operation counts in the reconstruction of boundaries.

Table 2.3 Operation counts for one integration procedure. a/b in the PHM row gives the operation counts: (if alpha < tol) / (if alpha > tol).

Table 2.4 Array input values for (2.87) for the steep gradient (SG) and irregular signal (IS) initialization profiles.

Table 3.1 L2 error norms and orders of convergence for the square wave.

Table 3.2 L2 error norms and orders of convergence for the triangle wave.

Table 3.3 L2 error norms and orders of convergence for the sine wave.

Table 3.4 L2 error norms and orders of convergence for the steep gradient profile.

Table 3.5 L2 error norms and orders of convergence for the irregular signal profile.

Table 3.6 1-D 10-run average CPU times and standard deviations for the 1,000-cell sine wave problem with 4,000 time steps (for 2 revolutions). Units in seconds. The suffix "Reg" means the scheme was run with a regular-mesh boundary value reconstruction (which is much more efficient). These runs were performed with Intel Fortran compiler options "-c -O3 -axT".

Table 3.7 1-D 10-run average CPU times and standard deviations for the 1,000-cell sine wave problem with 4,000 time steps (for 2 revolutions). Units in seconds. The suffix "Reg" means the scheme was run with a regular-mesh boundary value reconstruction (which is much more efficient). These runs were performed with Intel Fortran compiler options "-c -fast".

Table 3.8 Error norms for the cosine hill solid body rotation experiment. SLICE and SLICE-M are defined by ZWS05.

Table 3.9 Error norms for the slotted cylinder solid body rotation experiment. SLICE and SLICE-M are defined by ZWS05.

Table 3.10 Error norms for the Leveque data solid body rotation experiment.

Table 3.11 Error norms for the spherical cosine hill solid body rotation experiment.

Table 3.12 Error norms for the smooth deformational flow experiment.

Table 5.1 Tabulation of the L1 error norms for conservative interpolation of three sets of data ($N_\lambda$ = 1, 3, & 9) for the five methods of this study. $N_\lambda$ represents the number of wavelengths spanned zonally and meridionally across the global domain.

Table 7.1 Forcing coefficients for Adams-Moulton schemes of first- through third-order as applied to (7.8).

List of Figures

Figure 2.1 1-D CISL schematic of Eulerian and Lagrangian boundary and density approximation definitions. Black rectangles represent the cell density means, $\bar{\rho}_i$; black rectangle interfaces represent Eulerian boundaries, $x_{i\pm 1/2}$; dashed red lines represent Lagrangian boundaries, $x^*_{i\pm 1/2}$; and dashed black lines represent the approximations to the subgrid density, $\tilde{\rho}_i$. U indicates the direction of wind flow, and the solid red arrows represent the backward trajectory tracing of the Lagrangian boundaries.

Figure 2.2 Schematics of 1-D CISL remapping procedures. Black lines represent Eulerian boundaries, red dashed lines represent Lagrangian boundaries, gray shading represents the Eulerian arrival cell, and red shading represents the Lagrangian departure cell. U indicates the direction of wind flow.

Figure 2.3 Schematic of 2-D pointwise cascade interpolation. The rectangular black grid represents the Eulerian grid and the curvilinear red grid represents the Lagrangian grid. Filled black squares represent Eulerian grid points, filled black circles represent intermediate grid points, and filled red circles represent the target Lagrangian points.

Figure 2.4 Schematic of the conservative cascade scheme (CCS). The thin, black, rectangular grid represents the Eulerian boundaries and the thin, red, curvilinear grid represents the Lagrangian boundaries, where the intersections of these boundaries form the corners of each cell. Thick, dark blue slashes are the intersections of the Lagrangian latitudes with Eulerian longitudes, and the green dashed lines represent the North-South intermediate cell boundaries. Finally, the light blue dashed lines represent the East-West Lagrangian boundaries. In this example, the shaded intermediate row is used to calculate the mass in the target Lagrangian cell $A^*B^*C^*D^*$, which will be remapped to its corresponding Eulerian arrival cell ABCD.

Figure 2.5 Schematic of the (λ, µ) grid.

Figure 2.6 Schematic of the Eulerian and Lagrangian polar region. Thin, solid, black circles represent Eulerian latitudes and thin, dashed, black lines represent Eulerian longitudes. The thick, solid, red ellipse and thick, red, dashed lines represent the Lagrangian latitudes and longitudes respectively. The red shaded region is the Lagrangian polar cap, the red dot is the Lagrangian pole, and the black dot is the Eulerian pole.

Figure 2.7 Plot of a monotone Hermite and a cubic Lagrange interpolant fit to a discontinuous jump from zero to one.

Figure 2.8 Schematic of a PPM undershoot and the result of monotonic limiting. The thin, dashed, black line represents the left cell's average, which the center cell's approximation may not exceed if monotonicity is to be maintained. The red line represents the monotonically limited PPM approximation.

Figure 2.9 Output of PPM representations of the subgrid distribution for the irregular signal. Black boxes are the actual cell means, and red lines are the piecewise parabolas fit to the means.

Figure 2.10 Output of PHM representations of the subgrid distribution for the irregular signal. Black boxes are the actual cell means, and red lines are the piecewise hyperbolas fit to the means.

Figure 2.11 Values for the ratio of the power limiter value to the min limiter value over a wide range. The x-axis is displayed with log scaling. The black, red, blue, and green lines represent exponents of 3, 3.5, 3.9, and 4.0 respectively.

Figure 2.12 L2 error norms for varying PHM exponents on a grid of 80 cells.

Figure 2.13 PDLM representation of the irregular signal with 20 cells. The black boxes are the cell means, and the red lines are the PDLM approximations.

Figure 2.14 PDLM representation of the irregular signal with single precision calculations of the integrated means. The black boxes represent the actual means, and the red lines represent the single precision PDLM approximations to those means. The large errors are caused by floating point arithmetic problems.

Figure 2.15 PDHM representation of the irregular signal with 20 cells.

Figure 2.16 PRM representation of the irregular signal with 20 cells.

Figure 2.17 Surface and contour plot of L2 error norms for PPM-H using PDHM to resolve local extrema.

Figure 2.18 Surface and contour plot of L2 error norms for PPM-H using PHM to resolve local extrema.

Figure 2.19 Exaggerated schematics of new extrema created because of overshoots at discontinuous jumps. The black boxes represent the mean within a cell, the dashed red lines represent the derivatives at the interfaces between boxes across the jump, and the blue dashed lines show a schematic of the approximate hyperbolic fitting for cells $i$ and $i+1$. The wind is assumed to be blowing in the negative x direction.

Figure 2.20 Plots of the initial conditions. The square wave is in black, the triangle wave is in green, the sine wave is in dark blue, the steep gradient is in violet, and the irregular signal is in light blue.

Figure 2.21 Initial profiles for the 2-D Cartesian framework.

Figure 2.22 Contour plots of initializations for spherical geometry.

Figure 3.1 Comparison of 5 different spatial schemes. Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.2 Plot of WRF spatially third-order results with Eulerian RK-3 integration in time. Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.3 Plot of WRF spatially fourth-order results with Eulerian RK-3 integration in time. Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.4 Plot of WRF spatially fifth-order results with Eulerian RK-3 integration in time. Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.5 Cosine cone solid body rotation. Surface plotted after 1 rotation, $n_x = n_y = 33$, $n_t = 71$, $\Omega = [0, 32 \times 10^5\,\mathrm{m}]^2$, $\omega_r = 10^{-5}\,\mathrm{s}^{-1}$, $\Delta t = 2\pi\Delta x / n_t$. The x- and y-axes have units of $\times 10^5$ m.

Figure 3.6 Cosine cone solid body rotation. Surface plotted after 1 rotation, $n_x = n_y = 101$, $n_t = 96$, $\Omega = [0, 100]^2$, $\omega_r = 2\pi/(n_t \Delta t)$, $\Delta t = 1800$ s. The x- and y-axes have units of m.

Figure 3.7 PPM: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

Figure 3.8 PDHM: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

Figure 3.9 PPM-H: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

Figure 3.10 Contour plots of the results of spherical polar ($\alpha_r = \pi/2 - 0.05$) advection of a cosine hill over the sphere.

Figure 3.11 $L_1$ and $L_\infty$ norm plots for PPM and PPM-H polar solid-body rotation of a cosine hill on the sphere.

Figure 3.12 Contour plots of the results of spherical quasi-polar ($\alpha_r = \pi/2 - 0.05$) advection of a cosine hill over the sphere.

Figure 3.13 Contours of the smooth quasi-polar deformational flow experiment.

Figure 5.1 Schematics demonstrating the process of conservative interpolation on regular rectangular grids using the CCS. Gray shading denotes the fitting of approximations to data oriented along black arrows. Pink shading denotes the integration over approximations oriented along red arrows.

Figure 5.2 Contour plots of sine wave data on the sphere. Both plots are from the same perspective, and the integrated mean densities were calculated on a grid with 64 cells in the meridional direction and 128 cells in the zonal direction.

Figure 7.1 Plots of hypothetical analytical and numerical functions given by (7.60) and (7.61) respectively.

Figure 7.2 RMSE between (7.60) and (7.61) as a function of time. The x-axis denotes the number of error periods, $\tau_E$, over which the RMSE is integrated.

Figure 8.1 Solution amplitudes for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.2 Solution frequencies for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Negative modes have been reflected about ω = 0 for direct visual comparison with positive modes.

Figure 8.3 Numerical errors for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.4 Solution amplitudes for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.5 Solution frequencies for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Only positive modes shown.

Figure 8.6 Relative numerical errors for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.7 Solution amplitudes for AM-3 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.8 Solution frequencies for AM-3 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Negative modes have been reflected about ω = 0.

Figure 8.9 Relative numerical errors for AM-3 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.10 Numerical amplitudes for uncentered AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.11 Numerical frequencies for uncentered AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.12 Numerical errors for uncentered AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.13 Numerical amplitudes for uncentered AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.14 Numerical frequencies for uncentered AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.15 Numerical errors for uncentered AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.16 Relative numerical error comparison for AM-1, AM-2, AM-3, and first- and second-order uncentered NH Boussinesq simulations.

Part I

New Subgrid Approximations for the Conservative Cascade Scheme

Chapter 1

Introduction

1.1 Review of Semi-Lagrangian Methods

When modeling dynamical flows numerically by discretizing partial differential equations (PDEs) in time and space, the maximum time step is constrained by the fastest propagating wave speed because of a restricted domain of dependence. This constraint is the well-known Courant-Friedrichs-Lewy (CFL) limit, which defines the conditions of convergence of algebraic approximations to hyperbolic PDEs over a spatial mesh (Courant et al., 1928). Considering the simple case of advection of density (ρ) by a constant wind (u), ∂ρ/∂t = −u ∂ρ/∂x, explicit Eulerian methods discretize this PDE on a static mesh to calculate the local change in density as the fluid moves by. Such methods must limit the time step such that u∆t ≤ C∆x (where C represents the Courant number, whose maximum stable value is usually of order unity for typical second-order centered differences) if they are to converge to the correct solution and remain stable throughout the integration. The main problem with this constraint is that the temporal truncation error (analyzing the error terms of the truncated Taylor series) for explicit Eulerian methods is much smaller than the spatial truncation error. Therefore, efficiency could theoretically be greatly increased: were the time step limited by accuracy rather than stability, it could be increased with no appreciable increase in error (Staniforth and Cote, 1991). Theoretically, the model would be most efficient if the truncation error were roughly equivalent in both time and space.

A semi-Lagrangian (SL) approach to the advection problem avoids this time step restriction because there is no longer a domain of dependence in which the propagating wave (advection in this case) must be present for each time step. SL methods formulate the advection in a fully Lagrangian manner, Dρ/Dt = 0, and calculate the time integration following the fluid motion. Some SL methods are implemented with the static mesh defined at the future time, tracing the scalars from the static grid points upstream to the departure locations (forming a deformed Lagrangian mesh); these are called backward trajectory methods. Others are implemented with the static mesh defined at the current time, tracing the scalars downstream to the arrival locations; these are called forward trajectory methods (Nair et al., 2003; Leslie and Purser, 1995). Essential to both methods is the need for interpolation. In the backward trajectory formulation, the scalar values at the departure locations must be interpolated from the Eulerian mesh. In the forward trajectory formulation, the scalar values at the Eulerian mesh locations must be interpolated from the arrival values. This work adopts a backward trajectory formulation throughout; therefore, that implementation is assumed in any further discussion.

Thus, the SL methods used in this study consist roughly of three sequential parts: (1) calculation of the parcel departure locations, (2) interpolation of the scalar values at the departure locations, and (3) assignment of the scalar values to the future time on the Eulerian grid.
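To make these three stages concrete, the following is a minimal sketch of one backward-trajectory SL step for 1-D constant-wind advection on a periodic domain. It is illustrative only: it uses linear interpolation rather than the higher-order reconstructions of this thesis, and the function name and NumPy implementation are ours, not the thesis code's.

```python
import numpy as np

def sl_advect_step(rho, u, dt, dx):
    """One backward-trajectory semi-Lagrangian step for
    d(rho)/dt + u * d(rho)/dx = 0 on a periodic 1-D grid."""
    n = rho.size
    x_arr = np.arange(n) * dx                # (3) Eulerian arrival points
    x_dep = (x_arr - u * dt) % (n * dx)      # (1) trace departure locations
    j = np.floor(x_dep / dx).astype(int)     # left neighbor of each departure point
    w = x_dep / dx - j                       # fractional position within [x_j, x_{j+1}]
    # (2) interpolate at the departure points (linear for brevity)
    return (1.0 - w) * rho[j] + w * rho[(j + 1) % n]
```

Note that nothing here restricts u∆t ≤ ∆x; the departure point may lie many cells upstream, which is precisely the CFL freedom described above. This pointwise variant does not conserve mass, however, which motivates the cell-integrated methods reviewed next.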

1.2 Review of Conservative SL Transport Methods

Global Circulation Models (GCMs), and regional models as well, are composed of two main parts: the dynamical core and the physics. The dynamical core is responsible for the dry dynamics of the model, roughly characterized by some combination of simplifications to the primitive equations which conserve mass, momentum, and thermodynamic energy. The physics include moist processes such as convection and rainfall, chemical processes such as stratospheric and tropospheric ozone reactions, radiative processes and greenhouse gas interaction, and a wide array of others. Atmospheric modelers are encountering with increasing frequency the need for a dynamical core to possess certain properties, not only to render an accurate solution to the dry dynamics equations but also to provide a sound input for physics parametrization schemes. The SL method has been used in a number of operational forecast model contexts (Benoita et al., 1997; Tanguay et al., 1990; Ritchie et al., 1995); however, until recently operational SL methods have lacked many of these very important properties: conservation, positive definiteness, and monotonicity. The focus of this study is on what is widely considered to be the primary testbed for any new numerical scheme for time integration of the primitive PDEs: the scalar advection problem. Therefore, while these properties are described immediately below in the more general context of a full GCM dynamical core, they arise from and are manifested in passive scalar transport problems. For example, when integrating the primitive equations (not in flux form), if one assures strict mass conservation in the continuity equation, then mass is conserved entirely.

Positive definiteness is the most obvious need because, for almost all scalar quantities such as moisture variables and chemical concentrations, negative values are physically meaningless. Schemes that produce negative values for positive definite quantities that are large and frequent enough (such as the well-known spectral Gibbs phenomenon arising from the truncation of high wave numbers) can make enforcing positive definiteness a difficult task. For instance, Royer (1986) reviews some methods of enforcing positive definiteness on mixing ratio, a variable used extensively in parametrizing moist processes such as cumulus convection and autoconversion of rainfall from vapor, since negative values cannot be tolerated in many parametrization schemes. The simplest nontrivial method of enforcing this condition is to set all negative scalar quantities to zero and try to "borrow" as much as possible locally, either from the vertical column or from neighboring cells (Holloway and Manabe, 1971). As Royer notes, however, there is often not enough moisture in the vertical column to compensate, and spurious sources of moisture originate which add up over time to significant levels on the order of well-known physical sources. Even horizontal borrowing techniques (Gordon and Stern, 1982; Williamson, 1983), though they come much closer to conserving the global mixing ratio integral, still have little physical justification. Royer's global fix algorithms also produce excessive smoothing in the horizontal. There exist methods such as global multiplicative hole fillers which borrow from plentiful locations and relocate to negative values to conserve the global mass integral (Rood, 1987); however, these schemes result in artificial global transport.
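As an illustration of the class of fixer just described, the following is a minimal sketch of a global multiplicative hole filler, assuming cell masses (areas or volumes) are available as weights. The function name and implementation are ours; this is the generic idea, not Rood's (1987) specific algorithm.

```python
import numpy as np

def multiplicative_hole_filler(q, cell_mass):
    """Clip negatives to zero, then rescale the whole field so the global
    integral of q is unchanged. Locality is sacrificed: mass is effectively
    moved from plentiful cells into the filled holes, which is the
    'artificial global transport' criticized above."""
    total_before = np.sum(q * cell_mass)      # global integral before the fix
    q_fixed = np.maximum(q, 0.0)              # fill the negative 'holes'
    total_after = np.sum(q_fixed * cell_mass)
    if total_after > 0.0:
        q_fixed *= total_before / total_after
    return q_fixed
```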

Therefore, it is better if the numerical scheme does not yield negative values in the first place, an idea almost always coupled with monotonicity, which disallows the production of new extrema altogether. In the absence of inherent positive definiteness, as is encountered in this study, a satisfactory transport scheme should demonstrate that these negative values are small enough to have negligible effects.

Also important in a numerical scheme is the property of conservation, touched on in the preceding paragraph. Not only is this important for the physical accuracy of parametrization as noted above, but it has been found more recently that the conservation of mass is very important for consistency in flux-form primitive equation and scalar advection formulations (Lauritzen, 2005). Flux-form advection of a scalar calculates the advection of the product of the scalar and the surrounding control volume mass (or pressure or density in most implementations). As reviewed in Lauritzen (2005), when mass is not conserved locally in the dry dynamical equations, inconsistencies develop between the surface pressure tendency and the winds, and between the pressure change used to advect the winds in a scalar transport scheme and the pressure change used to extract the scalar from its flux form. Jockel et al. (2001) show that all fixes for this problem known at the time either fail to adequately preserve the shape of the scalar or introduce unphysical modes in the transport. However, Lauritzen notes that if the same method is used both in the scalar transport scheme and in the dynamical core, then these inconsistencies do not arise. Yet this practically requires that the dynamical core be mass conserving, since such a property is necessary in the transport scheme. Therefore, mass conservation is considered one of the more valuable properties of a numerical scheme for solving the PDEs in a GCM. In this study, conservation, positive definiteness, and monotonicity are all considered to be very high priorities and are analyzed in detail throughout.

To the author's knowledge, Lauritzen et al. (2006a, 2006b), Zerroukat et al. (2004), and Zerroukat et al. (2007) are the only operational formulations of a SL method that guarantee these properties. The Lauritzen et al. works are formulated based on the cell-integrated SL (CISL) method and the conservative cascade scheme (CCS) developed in Nair and Machenhauer (2002) and Nair et al. (2002) respectively. The Zerroukat works are based on a conservative semi-Lagrangian scheme called SLICE (Zerroukat et al., 2002; Zerroukat et al., 2004; Zerroukat et al., 2005). Typically, pointwise SL methods do not conserve mass or preserve scalar monotonicity, and a typical fix has been to add mass periodically back to the domain (Gates et al., 1971; Priestley, 1993; Gravel and Staniforth, 1994). In these mass restoration algorithms, the pressure gradients are rarely changed, meaning there is no sudden modification to the flow, but mass is not conserved locally. Whatever mass is lost locally is redistributed evenly across the rest of the grid points to ensure conservation of the global integral of mass.

There do exist SL schemes which conserve mass without the need for mass restoration, employing such techniques as conservative cascade interpolation and cell-integrated techniques. Leslie and Purser (1995) achieved mass conservation in 3-D Cartesian geometry by using a conservative form of interpolation within a cascade framework, where the interpolator approximates mass at a given location and is then differentiated to yield the resulting scalar value, similar to the boundary value reconstruction of Colella and Woodward (1984). Rancic (1992) also created a fairly computationally expensive conservative SL scheme in 2-D Cartesian geometry by fitting a 2-D biparabolic function to the control cells and integrating in 2-D to perform a conservative remapping. Later, Rancic (1995) introduced a conservative SL remapping algorithm applied to 2-D Cartesian geometry using a cascade dimensional splitting method, which proved to be more efficient than the fully 2-D integrated biparabolic approach. In that study, the piecewise parabolic method (PPM) (Colella and Woodward, 1984) was used to approximate the scalar distribution within control cells. Laprise and Plante (1995) developed a slightly more general method, similar to Rancic (1995), with forward and backward trajectory variations. However, as pointed out in Nair and Machenhauer (2002), none of these schemes are readily applicable to spherical geometry because of the well-known problem of meridians converging to a singularity at the poles. A few conservative SL advection schemes have been implemented in spherical geometry; however, most of them are relatively computationally expensive, and all of them suffer the time step restriction of keeping the meridional Courant number at or below unity, $C_\theta \le 1$.

1.2.1 Cascade Methods

Of particular importance to this study is the method employed by Nair et al. (2002) (hereafter NSS02) for performing conservative SL transport in 2-D spherical geometry. The basic framework of the scheme employs a technique known as cascading, which is a form of dimensional splitting without the excessive computation of a more basic tensor product. This technique was introduced to the high-order interpolation of SL values and is well explained by Purser and Leslie (1991), where it was found that the cascading approach was 2.9, 6.1, and 10.2 times faster than the Cartesian product when applied to fourth-, sixth-, and eighth-order accurate Lagrange interpolators. This rendered higher-order SL interpolation a much more attractive possibility than had previously been thought feasible given its computational burdens. In fact, the order of complexity in computation decreases from O(p³) to O(p) by switching from a Cartesian product interpolation to a cascade interpolation, where p is the order of the interpolating polynomial.

Within the conservative adaptation of this cascade framework, in each one-dimensional sweep, approximating functions are fit to each control volume. Then, the control volume boundaries are traced upstream to locate their departure points, and a conservative remapping via integration over subgrid approximating functions (parabolas in the Nair et al., 2002 study) is performed during each cascade sweep, ensuring mass conservation and, as it turns out, monotonicity (a product of using PPM). The spherical geometry was handled in a similar manner as in Nair and Machenhauer (2002) (hereafter NM02), wherein the meridional Courant number was restricted to be less than 1; and with the exception of the polar regions, the numerics were computed on a (λ, µ) grid, where λ is the longitude, µ = sin(θ), and θ is the latitude. NM02 mentioned extension of the spherical method to larger Courant numbers, which is described in much greater detail in Nair (2004). The polar caps (defined in NM02) are treated by redistributing the total polar cap mass based on weights that are computed via cubic Lagrange interpolation of the Lagrangian values (in other words, at the departure points). In the cells near the polar cap, a refinement was performed in the µ direction to increase accuracy where large grid distortion occurs. Both the cascading and the spherical application will be described in great detail later in this thesis. Zerroukat et al. (2002) also developed a similar algorithm utilizing piecewise cubics instead of the piecewise parabolas used in NSS02.
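The choice µ = sin(θ) mentioned above is motivated by a standard property of spherical geometry, sketched here for the reader's convenience (this one-line check is ours, not NM02's derivation):

\[
dA = a^2 \cos\theta \, d\lambda \, d\theta = a^2 \, d\lambda \, d\mu, \qquad \mu = \sin\theta,\quad d\mu = \cos\theta \, d\theta,
\]

so cells of equal extent in (λ, µ) have equal area on a sphere of radius $a$, and 1-D mass integrals along µ require no metric factor — exactly what a conservative 1-D sweep needs.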

1.3 New Subgrid Approximation for the CCS

Regarding the conservative cascade scheme briefly described earlier, the desired property of scalar conservation necessitates two conditions. First, the integrated mass across each control volume of the functions that approximate the subgrid distributions must match the pre-existing mass values defined for the respective cell. Second, the Lagrangian grid must exactly and uniquely span the entire physical domain without overlapping any of the deformed control volumes. In practice, the deformed Lagrangian grid arising from the Eulerian boundaries being traced upstream must either be exactly cyclic (as is required in the doubly cyclic Cartesian case) or must end at precisely the locations where another scheme then ensures mass conservation on the non-spanned parts of the domain (as is required in spherical geometry with polar caps). Also, monotonicity and positive definiteness are completely dependent upon the approximating functions. Therefore, since any function may be used to approximate the subgrid distribution within control volumes, the first goal of this study is to develop local, mass conserving, efficient, and accurate approximations of the subgrid variation during the conservative cascade remapping.

As preliminaries, PPM will be described and implemented in 1-D, 2-D Cartesian, and 2-D spherical geometries following the method of NSS02 for comparison with the newly developed approximations. The new approximations must first be adapted into the CCS scheme and spherical geometry prior to implementation. Specifically, three new non-polynomial based methods will be adapted to the CCS in this paper. All of these new methods were previously developed for other schemes, mainly the Eulerian Godunov-type context in the setup of a mass conserving Riemann problem. They require adaptation to the CCS context, similar to PPM's adaptation to the CCS in NSS02. There do exist other implementations of the use of parabolas, such as parabolic splines (Zerroukat et al., 2006), but splines require a global calculation (as the name might hint) because equating the interface derivatives yields a global matrix solve. This study is focused on local methods only.

First, piecewise hyperbolas will be used, adapted from Marquina (1994) in a third-order method called the piecewise hyperbolic method (PHM). An updated application of this piecewise hyperbolic reconstruction, given in Serna (2006), is used, wherein a power limiter is employed as opposed to the harmonic limiter of Marquina (1994). The power limiter was shown to be total variation bounded (TVB) and more accurate, especially in the presence of jump discontinuities. Serna called this method power-PHM, and we will use the "power-" prefix to denote the use of the power limiter as opposed to the harmonic limiter of Marquina (1994).

Second, third-order piecewise double logarithmic functions are implemented following Artebrant and Schroll (2006) (AS06), which is based on an earlier work initially investigating logarithmic approximation, Artebrant and Schroll (2005) (AS05). This method in a CCS context is hereafter referred to as the piecewise double logarithmic method (PDLM). The justification for not using the fifth-order double logarithmic approximation as introduced in AS05 is that the authors state that "DLR preprocessing does not ensure a well-defined reconstructing function." Additionally, the third-order single logarithmic approximation presented in AS05 is not suitable for CCS application because the derived algorithm did not produce a well-defined and integrable function across the entire control volume. AS05 implemented the algorithm in an Eulerian finite volume context in which only the boundary fluxes are necessary, and any singularity was analytically removed at that location. AS06 presented a method better suited for CCS application, the third-order double logarithmic reconstruction, in which singularities are prevented via a tolerance.

Third, third-order piecewise double hyperbolic functions, also taken from AS06 in an appended derivation, are implemented and tested. This method is not very similar to Marquina's and Serna's hyperbolic methods, as it is derived very similarly to the third-order double logarithmic method, with exactly the same tolerance in a variable that poses a threat of singularity. This method in the CCS context is hereafter referred to as the piecewise double hyperbolic method (PDHM). Finally, a scheme based on the ratio of two parabolas, called the piecewise rational method (PRM), developed in Xiao et al. (2002), is implemented in the CCS framework using the same interface values as PPM.

After implementing and comparing these methods against PPM, it was observed that PPM, though normally a very accurate third-order method (superior to the others, in fact), suffers degeneration of accuracy in the presence of steep gradients and local extrema. In fact, at local extrema, to ensure monotonicity, PPM is typically forced to be a piecewise constant equal to the original scalar mean of the respective control volume, rendering it only first-order accurate. Observing this fact and the properties of PHM and PDHM, a hybrid PPM-PHM method, hereafter referred to as PPM-Hybrid (PPM-H), was developed. In PPM-H, both PHM and PDHM were tested for resolving local extrema, but it was found that there was no robust improvement in accuracy, and PPM's natural first-order monotonic constraint was kept. However, PHM was found to be an excellent replacement at steep jumps because of its ability to handle steep gradients without excessive overshoots. Also, PHM was used to replace the PPM overshoot occurrences with great success. Thus, PPM-H uses PPM for smooth data and local extrema but uses PHM at steep jumps.
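The selection rule just described can be summarized in a few lines. This is only a paraphrase of the rule as stated above; the actual detection criteria for overshoots and steep jumps are defined in Section 2.3, and the function name is ours.

```python
def ppm_h_select(is_local_extremum: bool, is_steep_jump_or_overshoot: bool) -> str:
    """Return which reconstruction PPM-H uses in a given cell."""
    if is_local_extremum:
        return "PPM"   # monotonic PPM: first-order constant at extrema
    if is_steep_jump_or_overshoot:
        return "PHM"   # hyperbola handles the jump without large overshoot
    return "PPM"       # smooth data: monotonic PPM
```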

One very important thing to note is that all of the new methods, with the exception of PRM, require only a three-cell stencil, whereas PPM requires a four-cell stencil. A stencil represents the total number of cells required to construct the approximating function. Since PHM, PDLM, and PDHM only require the derivatives at the interfaces, they need only three cells for the needed computations. PPM and PRM both need four cells because the boundary values are reconstructed via a conservative, monotonic, cubic function, which requires four cells to compute. The stencil size is possibly even more important than the computational speed of a method because, in modern parallel architecture, communication is always the most expensive component in terms of running time. Parallel architecture may be employed whenever a task consists of independent subtasks, and this is certainly the case in constructing the approximating functions, because all that is needed are the cell averages.

Suppose a 2-D domain is decomposed by splitting the columns, and we focus attention on two adjacent columns which are divided from one another (meaning they are logically adjacent yet stored in the memory of different nodes on a cluster). For a given row, if each cell on either side of the boundary only requires information from the neighboring cells, two communications are necessary across this boundary for that row (corresponding to a send and receive for each processor). However, for PPM, three communications would be necessary because of the four-cell stencil during boundary value reconstruction. Thus, the cost of communication is significantly greater. A second advantage of the smaller stencil arises in the context of a vertical remapping, where one cannot go below the ground (in which case a less accurate linear remapping is used for the lowest cell). PPM, requiring more surrounding cells, would present more of a problem with the ground cells, where PHM requires one cell fewer. Thus, some of the new methods do hold one particular advantage over PPM, as their construction is more efficiently implemented in a parallel architecture.
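As a concrete reading of the stencil argument, a centered stencil of total width s cells needs ⌊s/2⌋ ghost-cell layers on each side of a subdomain boundary. This tiny helper (ours, for illustration only) makes the PPM-versus-PHM difference explicit:

```python
def ghost_layers(stencil_width_cells: int) -> int:
    """Ghost-cell layers needed per side for a centered stencil."""
    return stencil_width_cells // 2

# Three-cell stencil (PHM, PDLM, PDHM): one ghost layer to exchange per side.
assert ghost_layers(3) == 1
# Four-cell stencil (PPM, PRM boundary reconstruction): two layers per side.
assert ghost_layers(4) == 2
```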

Chapter 2

Methodology

2.1 Conservative Cascade Scheme

The conservative cascade scheme (CCS) is a method of breaking a multi-dimensional problem in space into a collection of 1-D problems (this is known as the cascade approach, one instance of the more general concept of dimensional splitting). Therefore, the CCS is inseparably tied to a 1-D implementation of a class of finite volume SL methods known as the cell-integrated SL (CISL) method. It is this CISL method which gives the CCS its conservation properties, and thus this 1-D framework will be described first. Then, the concept of cascading itself – how it obtains its efficiency, how it decomposes a 2-D task into a series of 1-D tasks, and how it can be formulated in a finite-volume manner – is explained. Then, further emphasis is given to explaining the low-level implementation details of the grid generation, the 1-D sweeps, and the remapping. Finally, the CCS is extended into spherical geometry by means of an alternate coordinate system, and the handling of the polar singularities is described in great detail. To summarize the process of cascade interpolation (a code sketch follows this list):

• Generate the intermediate grid

• Perform a 1-D sweep from the source grid to the intermediate grid

• Perform a second 1-D sweep from the intermediate grid to the final (target) grid
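The following skeleton shows how those three steps compose in 2-D. It is structural only: `sweep_1d` stands in for the conservative 1-D CISL remap of Section 2.1.1, the boundary arguments stand in for the grid-generation procedures of Sections 2.1.3–2.1.4, and all names are ours.

```python
import numpy as np

def cascade_remap(rho, sweep_1d, intermediate_bounds, target_bounds):
    """Two 1-D conservative sweeps in place of one full 2-D remap.

    rho[i, j]           : Eulerian cell means (j indexes columns)
    intermediate_bounds : per-column departure boundaries for sweep 1
    target_bounds       : per-row departure boundaries for sweep 2
    sweep_1d(means, b)  : 1-D CISL remap of cell means onto boundaries b
    """
    tmp = np.empty_like(rho)
    for j in range(rho.shape[1]):            # sweep 1: source -> intermediate
        tmp[:, j] = sweep_1d(rho[:, j], intermediate_bounds[j])
    out = np.empty_like(rho)
    for i in range(rho.shape[0]):            # sweep 2: intermediate -> target
        out[i, :] = sweep_1d(tmp[i, :], target_bounds[i])
    return out
```

Because each target point participates in only two 1-D operations instead of a full tensor product, the per-point cost drops from O(p³) to O(p), as noted in Section 1.2.1.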

2.1.1 1-D Cell-Integrated SL Framework

Typically, a backward trajectory SL method is based on pointwise interpolation at the departure points from the Eulerian grid at the previous time. However, as mentioned in the introduction, there exist different methods of treating SL advection within the finite volume paradigm. Here, the cell-integrated method of finite volume SL advection as used in NSS02, called the cell-integrated semi-Lagrangian (CISL) method, will be described. The most immediate difference is that in this context, each basic value on the grid is to be interpreted as the average scalar value across a given interval and not as a pointwise value. These intervals will be referred to as cells, and in one dimension there exist two groups of cells: Eulerian cells and Lagrangian cells. For convenience, from this point on the passive scalar being advected will be considered to be density (mass per unit volume). This volume-averaged interpretation of the grid values readily accommodates an inherently mass conserving (IMC) scheme because the total mass within a specific cell (i.e., the local mass) is defined as the integrated density across the volume:

\[ M_i = \int_{V_i} \rho \, dV \tag{2.1} \]

Therefore, the total mass in the domain is the sum of all of the integrated control volume masses, as long as the control volumes span the full physical domain without overlap:

\[ M_T = \sum_i \int_{V_i} \rho \, dV \tag{2.2} \]

Now, let us focus our attention on 1-D transport of density, as in the continuity equation. Equation (2.3) gives the continuity equation in 1-D in the absence of sources or sinks:

\[ \frac{\partial \rho}{\partial t} = -\frac{\partial (\rho u)}{\partial x} \tag{2.3} \]

This familiar equation, if it is to be treated in a conservative SL manner, must be cast in integrated Lagrangian form by integrating over a time-dependent control volume that moves with the fluid. The derivation is described in detail in Laprise and Plante (1995) and is summarized here. Integrating (2.3) spatially over a time-dependent control volume with left and right boundaries A(x,t) and B(x,t), respectively, which move and deform perfectly with the fluid, yields (2.4):

\[ \int_{A(x,t)}^{B(x,t)} \frac{\partial \rho}{\partial t} \, dx = -\int_{A(x,t)}^{B(x,t)} \frac{\partial (\rho u)}{\partial x} \, dx \tag{2.4} \]

Then, applying the Leibniz rule to (2.4) with functional boundaries, one obtains the relationship given in (2.5):

\[ \frac{d}{dt}\int_{A(x,t)}^{B(x,t)} \rho \, dx \;-\; \left[ \rho(B,t)\frac{dB}{dt} - \rho(A,t)\frac{dA}{dt} \right] \;+\; \Big[ \rho(B,t)\,u(B,t) - \rho(A,t)\,u(A,t) \Big] = 0 \tag{2.5} \]

In (2.5), however, since the boundaries move with the flow, dB/dt is equivalent to u(B,t) and dA/dt is equivalent to u(A,t), causing a cancellation which yields strict conservation of the volumetric integral. Stating this formally in more general terms:

\[ \frac{d}{dt}\int_{V(t)} \rho(x,t) \, dV = 0 \tag{2.6} \]

It is easy to demonstrate more intuitively that Equation (2.6), which is also called the integral form of the continuity equation, is consistent with Equation (2.3). Consider the Reynolds transport theorem, in which the change of mass inside a moving control volume is equivalent to the sum of the local change within the control volume and the flux of mass through the boundaries. The mass flux through the boundaries becomes zero, since the boundary expands and deforms perfectly with the fluid motion, allowing no fluid motion through the boundaries and rendering a conserved quantity inside the cell in the absence of local sources or sinks. The time discretization of (2.6) is essentially just a forward Euler step following parcel trajectories, since with no forcing there is no advantage in a higher-order temporal discretization. In this study, backward trajectories were used, and the target grid at the future time step is considered to be the static Eulerian grid. Also, if we indeed assume that the density represents the mean of the control volume, then the 1-D time discretization of (2.6) is as follows:

\[ \rho(t_{n+1}, x_i)\, l(t_{n+1}, x_i) = \rho(t_n, x_*)\, l(t_n, x_*) \tag{2.7} \]

where $t_n = n\Delta t$; $x_i = i\Delta x$ represents the cell center location; $x_*$ represents the backward-trajectory cell center location; and $l(t,x)$ represents the volume surrounding a cell center at a given time and cell center location (represented by $l$ because the volume is a length in 1-D).

This form, however, is inconvenient because the density of the backward-trajectory cell is not known as is. Therefore, the form that is used in practice (dividing out the Eulerian volume to obtain the new Eulerian density at the future time step) keeps the RHS in integral form, as shown in (2.8), where $x^*_{i-1/2}$ represents the left bounding point and $x^*_{i+1/2}$ the right bounding point of the Lagrangian departure cell corresponding to the Eulerian cell at the future time:

\[ \rho(t_{n+1}, x_i) = \frac{1}{\Delta x} \int_{x^*_{i-1/2}}^{x^*_{i+1/2}} \rho(t_n, x) \, dx \tag{2.8} \]

From this form, the process of CISL methods becomes clear. By some means to be explained later, functions are fit to the mean values within each Eulerian cell to approximate

the subgrid distribution and obtain spatial accuracy of any desired order of convergence. After these functions are fit, to obtain the scalar values at the next time level, the Eulerian boundaries are traced upstream to their departure locations. Then, to find the mass within each Lagrangian cell, the functions fit to the Eulerian cells are integrated between the Lagrangian boundaries. This mass is then simply remapped to the Eulerian cell positions and divided by the Eulerian cell length to obtain the new scalar. If the Lagrangian cells extend the full length of the physical domain and do not overlap, then as long as the subgrid distributions share the same integrated mean as the original cell means, mass is guaranteed to be conserved both locally and globally during the integrations over the Lagrangian control volumes. Therefore, it is absolutely necessary that whatever function approximates the scalar distribution within an Eulerian cell, that function must have the same integrated mean as the original cell mean, as in (2.9), where $x_{i-1/2}$ represents the left bounding point and $x_{i+1/2}$ the right bounding point of the Eulerian arrival cell, and $\tilde{\rho}_i(t_n, x)$ is the approximating function across the Eulerian cell identified by index $i$. Fig. 2.1 may be referenced for a visual representation of the definitions of $x_{i\pm 1/2}$, $x^*_{i\pm 1/2}$, $\bar{\rho}_i$, and $\tilde{\rho}_i$.

$$\int_{x_{i-1/2}}^{x_{i+1/2}} \tilde\rho_i(t_n, x)\,dx = M_i \qquad (2.9)$$

The main purpose of this study is the application of new subgrid approximation functions to the mass-conserving cascade SL algorithm; therefore, further explanation of the functions used is provided in detail in Section 2.2. As one might expect, the easiest way to ensure that the entire physical domain is covered by the Lagrangian cells is to enforce a cyclic boundary condition across the endpoints of the physical domain, such that the first and last boundaries are physically considered to be the same and integration past the physical domain wraps through the start of the domain. Fig. 2.2 gives three schematic examples of 1-D CISL upstream


Figure 2.1: 1-D CISL schematic of Eulerian and Lagrangian boundary and density approximation definitions. Black rectangles represent the cell density means, $\bar\rho_i$; black rectangle interfaces represent Eulerian boundaries, $x_{i\pm 1/2}$; dashed red lines represent Lagrangian boundaries, $x^*_{i\pm 1/2}$; and dashed black lines represent the approximations to the subgrid density, $\tilde\rho_i$. U indicates the direction of wind flow, and the solid red arrows represent the backward trajectory tracing of the Lagrangian boundaries.

Lagrangian boundaries and their corresponding Eulerian boundaries. Notice that the wind is moving in the negative x direction, and the boundaries are tracked backward in the positive x direction. In 1-D with uniform flow (Fig. 2.2a), there is no limit to the Courant number, because the only limitation of an SL advection scheme for pure advection is excessive deformation, known as the Lipschitz limitation (Lin and Rood, 1996; Smolarkiewicz and Pudykiewicz, 1992). In the figure, the Lagrangian boundaries are denoted by thick dashed red lines, the Eulerian boundaries by thin solid black lines, the length integrated over by a pink fill color, and the corresponding Eulerian cell to which it will be mapped by a gray fill color. The index of each boundary is marked in red for Lagrangian and black for Eulerian indices, and the boundary labeled "7/1" emphasizes that, with the cyclic assumption, the first and last domain boundaries are the same. In Fig. 2.2a, the remapping of Lagrangian cell 3 (the cell between Lagrangian boundaries 3 and 4) back to the

Eulerian arrival cell is emphasized. Fig. 2.2b shows the same example except in the context of approximated cells, to show that integration is performed over the cell and then that mass is remapped back to the arrival Eulerian cell and divided by the cell length to obtain the new scalar. Fig. 2.2c shows the method for a nonuniform flow, emphasizing the remapping of

Lagrangian cell 5 back to its Eulerian arrival cell, wherein divergence is implicitly accounted for by the changing lengths of the Lagrangian control volumes. In practice, the process is as follows. First, the grid is defined along the 1-D (we'll say x) axis; it may be regular or irregular, with n + 1 boundaries enclosing n cells. The scalar mean for each cell is abstractly held to be located at the midpoint between the cell boundaries to allow convenient localization of coordinates when integrating. Next, the scalar profile across the grid is initialized. In this study, 1-D initializations include a square wave, a triangle wave, a sine wave, an irregular profile, and a steep gradient profile. Throughout, the term "remapping" will be applied to the process of cell-integrated semi-Lagrangian advection, because the process is essentially just a 1-D conservative remapping of the values from one grid onto another.

Figure 2.2: Schematics of 1-D CISL remapping procedures: (a) 1-D CISL example with uniform flow; (b) integrated region for (a) over approximated cell values; (c) 1-D CISL example with nonuniform flow. Black lines represent Eulerian boundaries, red dashed lines represent Lagrangian boundaries, gray shading represents the Eulerian arrival cell, and red shading represents the Lagrangian departure cell. U indicates the direction of wind flow.

In all of these test cases, the advection is constant in time (though most often not uniform in space); therefore, the backward trajectories are calculated only once at the start of the simulation. The boundaries are traced backward, not the cell midpoints, because it is desirable to form another control-volume-based grid over which to integrate the subgrid approximation functions. With regard to implementing this integration procedure, the accumulated mass up to each Eulerian boundary is first calculated. Then, the algorithm finds the closest Eulerian boundary to the left of each Lagrangian boundary and integrates over the function in that cell up to the Lagrangian boundary. The accumulated mass up to each Lagrangian boundary is the sum of the accumulated mass up to the nearest left Eulerian boundary and the integrated mass from that boundary to the Lagrangian boundary using the subgrid approximation functions, as shown in (2.10)-(2.11), where $ME$ and $ML$ represent accumulated mass up to an Eulerian and a Lagrangian boundary respectively, and $\hat i$ represents the index of the Eulerian cell which contains the Lagrangian boundary of index $i - 1/2$:

$$ME_{i-1/2} = \sum_{j=1}^{i-1} \bar\rho_j\, \Delta x_j \qquad (2.10)$$

$$ML_{i-1/2} = ME_{\hat i - 1/2} + \int_{x_{\hat i - 1/2}}^{x^*_{i-1/2}} \tilde\rho_{\hat i}(x)\,dx \qquad (2.11)$$

At this point, the mass accumulated up to each Lagrangian boundary has been calculated.

To obtain the mass in each Lagrangian cell, the accumulated mass up to the left boundary is simply subtracted from that of the right, as shown in (2.12), where $M^*_i$ represents the mass within the Lagrangian cell of index $i$.

$$M^*_i = ML_{i+1/2} - ML_{i-1/2} \qquad (2.12)$$

In the case of wrapping across the domain, an if-statement tests whether the Lagrangian x-value at the right boundary is lower than that at the left. In this case, the accumulated mass across the entire physical domain is added to the right boundary's accumulated mass, and then the left boundary's accumulated mass is subtracted as usual. After the mass has been calculated for each Lagrangian cell, those masses are remapped back to their corresponding Eulerian cells and divided by the Eulerian cell lengths to obtain the new scalar values in a mass-conserving way. What is convenient about this system is that the framework remains exactly the same no matter what functions are used to approximate the subgrid distribution. Therefore, those functions can be plugged in and out through exactly the same interface when coding the algorithm. Because of this modularity, they will be explained separately; for understanding the 1-D CISL framework, it needs only be assumed that they exist and retain the mass of the cell they represent during the integration process.
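To make the modular framework concrete, the following is a minimal sketch (not the thesis code) of one 1-D CISL remapping step built around equations (2.10)-(2.12), assuming a cyclic domain; the names `cisl_remap_1d` and `subgrid_integral`, and the piecewise-constant reconstruction in the usage example, are illustrative placeholders for the subgrid approximations described in Section 2.2.

```python
import numpy as np

def cisl_remap_1d(rho, x_bnd, x_dep, subgrid_integral):
    """One conservative 1-D CISL remapping step (illustrative sketch).

    rho              : Eulerian cell means, shape (n,)
    x_bnd            : Eulerian boundaries, shape (n+1,), cyclic domain
    x_dep            : Lagrangian (departure) boundaries, shape (n+1,)
    subgrid_integral : subgrid_integral(i, a, b) -> mass of cell i's
                       reconstruction integrated from a to b within cell i
    """
    n = len(rho)
    dx = np.diff(x_bnd)
    domain = x_bnd[-1] - x_bnd[0]
    # Accumulated mass up to each Eulerian boundary, eq. (2.10)
    ME = np.concatenate(([0.0], np.cumsum(rho * dx)))
    # Accumulated mass up to each Lagrangian boundary, eq. (2.11):
    # wrap the boundary into the domain, find the containing Eulerian cell,
    # and integrate the subgrid function from that cell's left edge.
    xw = x_bnd[0] + (x_dep - x_bnd[0]) % domain
    ML = np.empty(n + 1)
    for b in range(n + 1):
        ihat = min(int(np.searchsorted(x_bnd, xw[b], side='right')) - 1, n - 1)
        ML[b] = ME[ihat] + subgrid_integral(ihat, x_bnd[ihat], xw[b])
    # Lagrangian cell masses, eq. (2.12), adding the total domain mass
    # whenever a cell wraps across the cyclic endpoint
    M_lag = ML[1:] - ML[:-1]
    M_lag[xw[1:] < xw[:-1]] += ME[-1]
    # Remap to the Eulerian arrival cells: divide by Eulerian cell length
    return M_lag / dx

# Usage with a first-order piecewise-constant reconstruction (illustrative):
rho = np.array([1.0, 2.0, 4.0, 2.0, 1.0])
x_bnd = np.linspace(0.0, 1.0, 6)
pwc = lambda i, a, b: rho[i] * (b - a)
rho_new = cisl_remap_1d(rho, x_bnd, x_dep=x_bnd - 0.05, subgrid_integral=pwc)
print(rho_new.sum(), rho.sum())   # equal: mass is conserved
```

Swapping `pwc` for any conservative reconstruction raises the spatial order without touching the driver, which is exactly the plug-in modularity described above.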

2.1.2 Cascade Dimensional Splitting

Running the SL algorithm in dimensions higher than 1-D can be troublesome because of the added complexity in integration and in the creation of well-defined, non-overlapping Lagrangian cells that span the physical domain. In this model, the complexity is reduced using a dimensional splitting technique called cascading, which was first applied to the SL interpolation problem in Purser and Leslie (1991). Cascade interpolation is an efficient parallel to a more trivial dimensional splitting technique, the Cartesian product interpolation. In a third-order Cartesian product (or tensor product in the more general form applicable to spherical geometry), typically the horizontal dimension is interpolated first with a cubic Lagrange interpolator (four interpolations in all). Then, a single vertical interpolation is performed using the four values obtained from the horizontal sweep. There are efficient implementations of this algorithm such

as the quasi-bicubic in Ritchie et al. (1995), where the top and bottom interpolations are linear instead of cubic with very little degradation in accuracy. However, this is not the most efficient approach to the interpolation problem, because increasing the order of accuracy from $n$ to $n+1$ in 2-D increases the number of points used in the interpolation from $(n+1)^2$ to $(n+2)^2$, and $O(n^2)$ calculations are necessary.

Instead of performing all of the interpolations parallel to the Eulerian axes as in the tensor product form, cascade interpolation performs two sweeps of interpolation through an intermediate grid. This intermediate grid generation does constitute an appreciable overhead in the cascade method. However, for multiple tracer advection, the grid needs to be calculated only once, whereas in a Cartesian product interpolation, all of the interpolations must be performed for every scalar. Throughout this section, Lagrangian latitudes and longitudes will refer to the curvilinear departure locations of the static Eulerian latitudes (or boundaries parallel to the x-axis in Cartesian geometry) and longitudes (boundaries parallel to the y-axis in Cartesian geometry) respectively. Meridional and zonal directions, when used in reference to Cartesian geometry, refer to the y- and x-directions respectively. The first interpolation sweep is performed parallel to one of the Eulerian axes (though the choice is arbitrary, for this example we'll say parallel to the Eulerian latitudes) to obtain the approximate values on an intermediate grid defined by the intersection of the Lagrangian longitudes with the Eulerian latitudes. Fig. 2.3 shows a schematic of the pointwise cascade process, where solid squares represent the known data at the current time step, the red circles represent the upstream departure locations, and the black circles represent the intermediate grid. In the first sweep, parallel to the Eulerian latitudes, the values are interpolated at the black circle positions defined by the intersection of the Lagrangian longitudes with the Eulerian latitudes. Then, in a second sweep, along the Lagrangian longitudes, the final departure values are interpolated using Euclidean distance as the independent variable. Of course, in reality, the Lagrangian longitudes are approximated as straight line segments in Cartesian geometry and usually as great circle segments in spherical

geometry. Then, the departure values (red circles) are simply assigned back to the Eulerian grid. Thus, only n interpolations are performed in n-dimensional space, with the overhead of intermediate grid generation.

Figure 2.3: Schematic of 2-D pointwise cascade interpolation. The rectangular black grid represents the Eulerian grid and the curvilinear red grid represents the Lagrangian grid. Filled black squares represent Eulerian grid points, filled black circles represent intermediate grid points, and filled red circles represent the target Lagrangian points.

To conserve mass in the method used in this study, it is necessary to integrate across mass-preserving approximating functions (i.e., a CISL method) rather than use pointwise interpolation. This process of integrating over the length of a Lagrangian cell and dividing that mass by the Eulerian arrival cell's length will be called remapping, to differentiate it from the concept of interpolation (though remapping, in essence, is an integral form of interpolation). This can be done directly in 2-D without cascading, as in NM02, by integrating over the 2-D area of each Lagrangian cell and using quasi-biparabolic (the quasi- denoting the lack of cross-terms)

functions to approximate the subgrid distribution. However, such a method is not trivial to expand into a 3-D regime. Tensor products are not only more expensive than cascading, but they are also more difficult to implement when considering a series of 1-D conservative remappings. It would be very difficult to represent a deformed cell in a tensor product remapping because each of the remappings must be parallel to the Eulerian axes. The Lagrangian cell would seemingly be forced into representation as a single rectangular region, keeping it from more accurately representing the true deformed geometry. Conservative cascading, however, allows this geometry to be represented by more than one rectangular region. To perform cascading in a cell-based paradigm, called the conservative cascade scheme (CCS), instead of in a pointwise paradigm, the intermediate grid generation is now used to define the intermediate cell boundaries, where the intermediate cells are bounded by Eulerian boundaries in one direction and by intermediate Lagrangian boundaries in the other. The mass is calculated in the cells defined by these intermediate boundaries using the 1-D CISL method described previously in this section.

Then, the final Lagrangian boundaries are calculated in the other direction, and the masses in the intermediate cells are used to calculate the final mass in the full Lagrangian cell. The reason this method can provide a better representation of deformed Lagrangian cell geometry is best explained with a schematic.

Fig. 2.4 gives a schematic of the CCS process. Black lines denote the boundaries of the Eulerian cells, and red lines denote the boundaries of the Lagrangian cells. The first step is to generate an intermediate grid. In this case, the first sweep is performed in the meridional direction, parallel to the Eulerian longitudes. Therefore, the intermediate grid is defined by the intersections between Lagrangian latitudes and Eulerian longitudes, denoted by blue slashes in the diagram. Green lines mark the intermediate cell boundaries calculated from the intersection points left and right of each cell column, and these boundaries are always approximated to be parallel to the Eulerian latitudes. In the first sweep, the mass within each intermediate cell is computed using the 1-D CISL scheme employing the cyclic assumption. The light gray

areas denote the intermediate row containing the masses that will be used to calculate the final mass in the Lagrangian cell bounded by the points $A^*B^*C^*D^*$, which corresponds to the Eulerian cell $ABCD$. Then, the final grid is formed using the Lagrangian grid intersections

(intersections of red lines) denoted by the lines $A^*C^*$ and $B^*D^*$ in the schematic (thick dashed light blue lines). These as well are approximated by lines parallel to the Eulerian longitudes for computational simplicity (and efficiency, for that matter), since integrating between two functional boundaries is more time consuming, more difficult, and requires more data storage. In the final sweep, the 1-D CISL method is performed over functions fitted to the intermediate masses to calculate the masses within the final Lagrangian cells. The reason the CCS can represent

Lagrangian cell geometry better than a tensor product is that the geometry is inherently represented by adjacent rectangular regions, as the 1-D remapping incorporates the y-shift implicitly (shown in the schematic). Therefore, the only time the concept of area is explicitly applicable is when the mass is remapped from, say, $A^*B^*C^*D^*$ back to $ABCD$, and the mass is divided by the area of the Eulerian arrival cell to obtain the new scalar value. However, this is trivial since the Eulerian grids in this study are regular and represented by rectangles. The particular order of cascading formally makes no difference. Considering an implementation readily adapted to spherical geometry, however, it is far easier to remap in the

meridional direction first, for two reasons which will be explained in Section 2.1.7. First, because the meridians converge to a singularity at the poles, those regions must be isolated from the rest of the scheme for special handling. A meridional remapping for the intermediate sweep is most natural, then, because it can be bounded just beyond the polar regions and still calculate the mass in the polar caps (necessary to ensure mass conservation). Thus, the polar caps may

be easily isolated from the rest of the scheme. Second, it turns out that grid distortion requires a refinement in the meridional direction in which the intermediate cells in that direction are split into a given ratio, and a zonal sweep is performed on each of these refined cells. In this process the mass in each refined intermediate cell is needed, and this can be conveniently calculated


Figure 2.4: Schematic of conservative cascade scheme (CCS). The thin, black, rectangular grid represents the Eulerian boundaries and the thin, red, curvilinear grid represents the Lagrangian boundaries where the intersections of these boundaries form the corners of each cell. Thick, dark, blue slashes are the intersections of the Lagrangian latitudes with Eulerian longitudes, and the green dashed lines represent the North-South intermediate cell boundaries. Finally, the light blue dashed lines represent the East-West Lagrangian boundaries. In this example, the shaded intermediate row is used to calculate the mass in the target Lagrangian cell A∗B∗C∗D∗ which will be remapped to its corresponding Eulerian arrival cell ABCD.

using the same approximating functions used in the meridional sweep itself. Therefore, the first remapping is in the meridional direction, and the final sweep is in the zonal direction. On another implementational note, cascading is much more natural than tensor products because only one integration over a Lagrangian cell is necessary to ensure mass conservation, and in 2-D (especially in Cartesian cyclic geometry) it is quite trivial to make sure the physical domain is completely spanned under the cyclic assumption. Also, it is easier to implement than a full 2-D integration because the approximating functions may be developed and tested in a 1-D context and simply plugged into a higher-dimensional scheme with no need for further modification. It has been noted in Lauritzen et al. (2006a) that the CCS of

NSS02 is much more efficient than the quasi-biparabolic method of NM02, and though the CCS is generally less accurate, it is actually more accurate in the presence of sloping Lagrangian latitudes, because the missing cross-terms in the quasi-biparabolic representation do not capture such a change with full accuracy. Mainly, it is the implementational simplicity and computational efficiency of the CCS that make it so attractive.
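As a structural illustration of that simplicity, the sketch below (assumed, not from the thesis) composes a CCS step out of a 1-D routine `remap_masses_1d`, taken to behave like the earlier 1-D CISL sketch but returning the masses between consecutive departure boundaries; the boundary arrays `y_int_bnd` and `x_lag_bnd` are assumed precomputed by the intermediate- and final-grid generation described in the next sections.

```python
import numpy as np

def ccs_step(rho2d, x_bnd, y_bnd, y_int_bnd, x_lag_bnd, remap_masses_1d):
    """One CCS transport step as two 1-D conservative sweeps (sketch).

    rho2d     : cell-mean densities, shape (nx, ny)
    y_int_bnd : intermediate North-South boundaries per column, (nx, ny+1)
    x_lag_bnd : final East-West Lagrangian boundaries per row, (ny, nx+1)
    """
    nx, ny = rho2d.shape
    dx, dy = np.diff(x_bnd), np.diff(y_bnd)
    # Meridional sweep: make scalars one-dimensional in y (see eq. 2.14 below),
    # then compute the mass inside each intermediate cell, column by column.
    m_inter = np.empty_like(rho2d)
    for i in range(nx):
        m_inter[i, :] = remap_masses_1d(rho2d[i, :] * dx[i], y_bnd, y_int_bnd[i])
    # Zonal sweep: make scalars one-dimensional in x (equivalent to eq. 2.15,
    # since mass / dx = density * dy), then compute the mass inside each final
    # Lagrangian cell, row by row.
    m_lag = np.empty_like(rho2d)
    for j in range(ny):
        m_lag[:, j] = remap_masses_1d(m_inter[:, j] / dx, x_bnd, x_lag_bnd[j])
    # Remap to the Eulerian arrival cells: divide by cell area for new means.
    return m_lag / np.outer(dx, dy)
```

Any conservative subgrid reconstruction can be supplied to the 1-D routine without changing this driver, which is precisely the modularity noted above.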

2.1.3 Generating the Intermediate Grid

The CCS as described earlier is a 2-D remapping algorithm broken into two 1-D sweeps through an intermediate grid. The intermediate grid is essentially shifted either in the east-west direction or the north-south direction, parallel to one of the standard axes. There are two main ways of generating the intermediate grid in cascade interpolation, depending upon which direction is used for the first sweep. However, as mentioned earlier, the first sweep must be in the meridional direction in order to isolate the polar caps when performing advection in spherical geometry. In the CCS as implemented in this study, the mass is calculated in a meridional sweep parallel to the Eulerian longitudes and then in a zonal sweep pseudo-parallel to the Lagrangian latitudes. The term pseudo-parallel is used only because the North-

South boundaries are approximated by lines parallel to the Eulerian latitudes and do not curve with the deformed Lagrangian latitudes within the cells, but rather approximate that curve with piecewise straight lines. To generate the intermediate grid, one must first find the intersection points between the

Lagrangian latitudes and the Eulerian longitudes. In Cartesian geometry, a simple linear approximation is sufficient and does not greatly degrade accuracy (Bates et al., 1990; McDonald, 1987; Temperton and Staniforth, 1987). As will be discussed later, linear approximation of the boundary intersections is not accurate enough in spherical geometry. After these intersection points are found, the North-South boundaries of the intermediate cells are taken to be the average of the left and right intersection points. This forms the intermediate cells which will be integrated to obtain intermediate cell mass for each column in the meridional sweep. To generate the final East-West boundaries of the full Lagrangian cells, in Cartesian geometry it is sufficient to simply take the average of the North and South Lagrangian points for each

East-West boundary. From Fig. 2.4, this corresponds to averaging the y-values of $A^*$ and $C^*$ for the West boundary and the y-values of $B^*$ and $D^*$ for the East boundary. Then, a final cascade sweep uses the intermediate masses to integrate the mass in these final Lagrangian cells, which will be remapped to their respective Eulerian cells and divided by the

Eulerian cell area to obtain the new scalar values after one time step of advection. On an implementational note, it turns out that in order to have well-defined intersections without extrapolation in the zonal direction, Lagrangian points are needed to the left and to the right (for cubic interpolation, two points left and right) of every Eulerian boundary. In order to find these left and right points in an efficient binary search, the Lagrangian point array must be monotonic. If the points are immediately wrapped cyclically back into the domain, however, then the Lagrangian array is not monotonic, and this makes the process of finding these values more complicated. Also, in some cases, such as solid body rotation where the Lagrangian latitudes are tilted, one cannot apply a cyclic assumption to the boundaries

when calculating the intersection points, because the right-most value is not the same as the left-most. The problem of finding the North-South boundaries for the intermediate cells turns out to be inherently non-cyclic in general. Therefore, an extra buffer region of Eulerian points is appended to the physical domain, and Lagrangian trajectories are calculated for these points according to the same rules as the others (constant advection, deformational flow, or solid-body rotation). The buffer needs only be large enough that the Lagrangian points extend 1 (2) grid points beyond the physical domain for linear (cubic) intersection approximation. Conveniently, the same buffer region is used when forming the functional approximations, since the approximation for the left-most and right-most Eulerian column cells needs information beyond the physical domain. Again, the cyclic assumption holds for this as well. After the intermediate boundaries have been calculated, they are wrapped into the physical domain if they extend beyond it, ensuring mass conservation by spanning the physical domain completely. For details and a more formal mathematical expression of this process, see Nair et al. (1999), Nair and Machenhauer (2002), and Nair et al. (2002).

2.1.4 Generating the Target Lagrangian Grid

Generating the final Lagrangian grid basically consists of calculating the East-West boundaries for each intermediate row (pseudo-parallel to the Lagrangian latitudes). This is done in a fairly simple manner in Cartesian geometry, though the process becomes more complicated in spherical coordinates. For a given intermediate row, each Eulerian East-West boundary corresponds to a deformed Lagrangian East-West boundary, which in this scheme is approximated by a straight line parallel to the Eulerian longitudes. Therefore, for each East-West boundary in each intermediate row, the average of the longitudes of the North and South Lagrangian departure points is used to define the longitude of the final Lagrangian boundary. As an example, in Fig. 2.4, consider the intermediate row in which the cell bounded by $A^*B^*C^*D^*$ lies. The

Lagrangian East boundary of this cell, $A^*C^*$, corresponding to the departure location of the Eulerian East boundary $AC$, is approximated by simply averaging the longitudes of $A^*$ and of

$C^*$. There is one aspect of this averaging that theoretically may not be ideal. The calculated average is representative of the interval between the North and South Lagrangian departure points and not of the actual North-South intermediate boundaries. In theory, it would be more representative to calculate the intersections of the Lagrangian longitudes with the North-South intermediate boundaries and average those longitudes to obtain the interval. However, this process turns out to be not only computationally more expensive but also non-trivial, because the North-South boundaries, represented by piecewise constants, involve jump discontinuities in $C^0$ space. Therefore, when Lagrangian longitudes cross an Eulerian longitude, one would need a method of resolving this intersection across a discontinuity. In practice, there is very little error associated with mapping the average of the North and South Lagrangian departure point longitudes to the intermediate cell interval, because the cell interval approximates the Lagrangian interval with sufficient accuracy.

2.1.5 1-D Meridional Sweep

So far, the CCS description has been admittedly "fuzzy" in certain parts. This section and the next are intended to tie up the loose ends and give more precise explanations of what exactly occurs during the two cascade sweeps and how the process conserves mass. The grid means are defined for each cell at the beginning of each time step. In practice, since the 2-D flows used in this study are all constant in time (though usually not uniform in space), the backward trajectory calculations and the intermediate and final grids are all determined prior to the simulation. The original scalars across the grid are equal to the mass of the cells divided by the area of the cells, making the scalar values in terms of both dimensions. The only invariant regarding dimensionality for the cells is mass itself, which means that depending upon the

dimension of the cascade sweep, the scalar values in that direction differ for the same cell.

For example, suppose the mass of a cell is $M_{i,j}$; in a 2-D context, the density for that cell is $\rho_{i,j} = M_{i,j}/(\Delta x_i\, \Delta y_j)$. However, if one were to fit a function to these scalar values directly, then integrating over them in only one dimension (say the y-direction, since this is the meridional sweep) would not truly calculate the mass (2.13).

$$\int_{y_{j-1/2}}^{y_{j+1/2}} \tilde\rho_i(t_n, y)\,dy = \rho_{i,j}\,\Delta y_j = M_{i,j}/\Delta x_i \qquad (2.13)$$

Because of this fact, the scalar values must be preprocessed before performing the meridional sweep, making the scalars in terms of the meridional direction only by multiplying by the grid spacing in the zonal direction (2.14).

$$\hat\rho_{i,j} = \rho_{i,j}\,\Delta x_i \qquad (2.14)$$

After this preprocessing, integrating over conservative approximations fit to these scalars will yield the true mass. Therefore, the scalars are preprocessed, and then conservative approximat- ing functions are fit to those values. In the implementation of this phase, the mass is calculated

within each intermediate cell cyclically in the meridional direction, just as described for the 1-D CISL method. These masses are then stored back into the same 2-D array in which the original densities were stored, to keep track of which intermediate cell corresponds to which Eulerian arrival cell. This is convenient because the data will be processed along the intermediate cell rows

in the zonal sweep, and logically, this corresponds to the i-index of the same 2-D array with no distinction of North-South shifts of cell boundaries.
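The need for the preprocessing in (2.14) can be seen numerically. In the hypothetical two-by-two example below, integrating the raw 2-D density in y alone misses the mass by the factor $\Delta x_i$, while the preprocessed scalar recovers it exactly; all values are made up for illustration.

```python
import numpy as np

dx, dy = np.array([0.5, 1.5]), np.array([1.0, 2.0])   # irregular cell sizes
M = np.array([[1.0, 2.0], [3.0, 4.0]])                # prescribed cell masses
rho = M / np.outer(dx, dy)                            # 2-D cell-mean densities
i, j = 0, 1
print(rho[i, j] * dy[j])            # = M[i,j]/dx[i], not the mass (eq. 2.13)
rho_hat = rho * dx[:, None]         # preprocessing, eq. (2.14)
print(rho_hat[i, j] * dy[j])        # = M[i,j]: the y-integral now gives mass
```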

2.1.6 1-D Zonal Sweep

At this point in the scheme, the masses calculated in the meridional sweep are the values stored for each cell on the grid. Therefore, the locations of each row in the 2-D computational array physically reference the deformed curvilinear intermediate rows pseudo-parallel to the Lagrangian latitudes. In this sweep, the mass is conceptually calculated along these Lagrangian latitudes by fitting conservative approximation functions to describe the subgrid distributions in the zonal direction and integrating over them between the Lagrangian East-West boundaries calculated beforehand. Just as for the meridional sweep, before the zonal sweep is performed the scalars must be preprocessed such that integrating over them yields the true Lagrangian cell mass (2.15).

$$\hat\rho_{i,j} = \rho_{i,j}\,\Delta y_j \qquad (2.15)$$

After the scalars are preprocessed for this sweep, the mass between each pair of Lagrangian boundaries is calculated in the same fashion as in the 1-D CISL scheme for each cell row. The mass value for each Lagrangian cell is remapped back to its corresponding Eulerian arrival cell, and the Eulerian cell area is divided out to obtain the new scalar. At this point, conservative transport has been performed for one time step.

2.1.7 CCS in Spherical Coordinates

There are several adaptations required when converting from Cartesian to spherical geometry, due mainly to three problems: the polar singularities, the curvilinear latitude / longitude cell boundaries, and the extreme curvature in the polar regions. For simplicity of transition, the same terminology for Eulerian and Lagrangian latitudes and longitudes is used throughout this section and the previous. For a traditional pointwise interpolating SL scheme, none of the aforementioned issues is particularly problematic. However, masses must be calculated via

integration in the CCS, and besides the obvious difficulty of integrating over a singular point, the calculation of the boundaries defining the intermediate and Lagrangian cells in the severely curving polar regions becomes a problem as well. Because the polar caps can be isolated in the meridional cascade sweep, their special treatment will be covered last.

2.1.7.1 Transforming the Coordinate System: (λ, θ) → (λ, µ)

The curvilinear boundaries of each cell make it difficult to integrate a particular quantity over a cell, as such an integral in spherical coordinates would be:

$$\int_{\lambda_1}^{\lambda_2}\int_{\theta_1}^{\theta_2} \rho(\lambda, \theta)\cos\theta\; d\theta\, d\lambda \qquad (2.16)$$

where $\lambda_1$ and $\lambda_2$ denote the west and east zonal boundaries respectively, and $\theta_1$ and $\theta_2$ denote the south and north boundaries respectively. Analytically integrating with the $\cos\theta$ factor multiplying the integrand makes the integration more difficult and computationally

expensive. Not only that, but the 1-D CISL scheme does not strictly hold if the integration procedure differs when integrating in the meridional sweep, and a method cannot be simply plugged into the 2-D framework. This makes development of new schemes more difficult as the code is not modular in this regard. Therefore, a simplifying coordinate transformation is

used in the CCS process for spherical geometry, as also done in Nair and Machenhauer (2002) and Nair et al. (2002), where the variable $\mu = \sin\theta$ is used as the independent variable in the meridional direction rather than latitude itself. In this (λ, µ) coordinate system, the integrated mass in a cell becomes:

$$\int_{\lambda_1}^{\lambda_2}\int_{\mu_1}^{\mu_2} \rho(\lambda, \mu)\; d\mu\, d\lambda \qquad (2.17)$$

Because the original ranges are typical of most implementations of geophysical spherical coordinates, $\lambda \in [0, 2\pi)$ and $\theta \in [-\pi/2, \pi/2]$, which by definition implies $\mu \in [-1, 1]$, with the equator defined by $\mu = \theta = 0$.


The (λ, µ) grid is conceptually just an irregular rectangular grid with a stretch applied in the meridional direction, demonstrated schematically in Fig. 2.5. Therefore, in practice, this grid may be handled in exactly the same manner as the Cartesian grid, as long as it is noted that the lines formed by $\mu = 1$ and $\mu = -1$ are in reality just points (the North and South Poles respectively). Also, the transformation is area preserving because $d\lambda\,d\mu = d\lambda\,d(\sin\theta) = \cos\theta\,d\lambda\,d\theta$.

Figure 2.5: Schematic of the (λ, µ) grid.
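The area-preserving property can be checked numerically; the short quadrature below (a sketch with made-up cell bounds) confirms that the spherical area element $\int\cos\theta\,d\theta\,d\lambda$ of a cell equals the plain rectangle $\Delta\lambda\,\Delta\mu$ in the (λ, µ) plane.

```python
import numpy as np

lam0, lam1 = 0.0, np.pi / 3          # hypothetical cell bounds in longitude
th0, th1 = np.pi / 6, np.pi / 3      # hypothetical cell bounds in latitude
theta = np.linspace(th0, th1, 100001)
area_sphere = (lam1 - lam0) * np.trapz(np.cos(theta), theta)
area_mu = (lam1 - lam0) * (np.sin(th1) - np.sin(th0))  # rectangle in (lam, mu)
print(np.isclose(area_sphere, area_mu))                # True
```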

2.1.7.2 More Accurate Intersection Calculations

In spherical geometry, especially when using (λ, µ) coordinates, it turns out that a linear approximation to the intersections of Lagrangian latitudes and Eulerian longitudes is not accurate enough, because a linear midpoint is being calculated for a curvilinear connection between two points. There are two methods of dealing with this problem, as mentioned in NSS02. First, one may construct a great circle line passing between the two Lagrangian points left and right of the Eulerian boundary and perform the intersection calculation based on this equation. The

equations need not be given here, because this method was stated in NSS02 to be less accurate although more computationally efficient. The method used in this study was simply to employ a fourth-order accurate cubic interpolation polynomial based on Lagrange bases (a cubic Lagrange interpolator) to calculate the intersection point. A Lagrange interpolation polynomial of order four is defined as follows, where $(\lambda^*_k, \mu^*_k)$ for a given Lagrangian latitude line refers to the (λ, µ) location of the Lagrangian departure point just to the left of the Eulerian boundary being intersected:

$$L(\lambda) = \sum_{i=-1}^{2} \mu^*_{(k+i)}\; l_i(\lambda) \qquad (2.18)$$

$$l_i(\lambda) = \prod_{j=-1,\, j\neq i}^{2} \frac{\lambda - \lambda^*_{(k+j)}}{\lambda^*_{(k+i)} - \lambda^*_{(k+j)}} \qquad (2.19)$$

The intersection value, $\hat\mu_i$, is found by setting $\lambda = \lambda_i$ in (2.18) and (2.19), where the subscript $i$ represents the Eulerian boundary being intersected. Therefore, the location of the intersection on each Eulerian boundary is calculated as $(\lambda_i, \hat\mu_i)$. In Cartesian space, the left and right intersection values for each Lagrangian latitude for each cell column are simply averaged together to obtain the intermediate cell boundaries. In spherical geometry, however, averaging these values in µ-space would be inaccurate because of the stretched grid. They are, therefore, averaged linearly in θ-space (using the actual latitudes), and the boundary value is then transformed back into µ-space.
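A minimal sketch of this intersection calculation follows; `lam_dep` and `mu_dep` (assumed names) hold the departure points along one Lagrangian latitude, and `k` indexes the point just left of the Eulerian boundary, matching (2.18)-(2.19).

```python
import numpy as np

def mu_hat(lam_eul, lam_dep, mu_dep, k):
    """Cubic Lagrange estimate (2.18)-(2.19) of the intersection of a
    Lagrangian latitude with the Eulerian longitude lam_eul."""
    idx = (k - 1, k, k + 1, k + 2)
    val = 0.0
    for a in idx:
        li = 1.0
        for b in idx:
            if b != a:             # Lagrange basis l_a evaluated at lam_eul
                li *= (lam_eul - lam_dep[b]) / (lam_dep[a] - lam_dep[b])
        val += mu_dep[a] * li
    return val

# Left/right intersections of a cell column are then averaged in theta-space,
# not mu-space, before transforming back:
avg_mu = lambda muL, muR: np.sin(0.5 * (np.arcsin(muL) + np.arcsin(muR)))
```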

2.1.7.3 Polar Cell Refinement and Polar Tangent Planes

It turns out also that straight lines in the (λ, µ) coordinates are very unrepresentative of the actual great circles connecting two points near the polar regions, which is easily pictured by noting the strong curvature of latitude lines near the poles. This is only a problem in the case of the

zonal cascade sweep near the poles, where the East-West cell walls are approximated by lines. This approximation is not satisfactory in the polar regions of strong curvature, and therefore the cells in these regions are refined to give a better piecewise straight-line approximation to the East-West boundaries. This deficiency is described in particular detail in NM02. The refinement process consists initially of breaking M cells just outside the Lagrangian polar cap into N rows by subdividing the µ-coordinates between the South and North boundaries of the intermediate grid evenly in θ-space. The integers M and N may in practice be set as desired, and this refinement is performed prior to the simulation because the flows are constant in time. In reality, wind varies in space, and thus trajectory calculations must be performed for each grid point. Trajectory calculations are a well-known computation necessary for all Lagrangian-form methods, and they constitute a significant computational load. Next, the Lagrangian East-West boundaries within each refined row are calculated similarly to the normal final grid calculation, with a couple of exceptions. Instead of averaging the longitudes of the North and South Lagrangian points (which corresponds to a bisection in Cartesian geometry), a line is created through the North and South Lagrangian points and broken into N even segments. The midpoint of each meridional segment is held to be the East-West boundary for the respective refined row and column. The segmentation and midpoint calculations are all performed in polar tangent coordinates, where a line more accurately represents the great circle that passes through the North and South Lagrangian points. The conversions to and from polar tangent coordinates are given in Table 2.1. The projection is performed by forming a plane tangent to a pole (for this example, we'll say the North Pole). Then, a line is passed from the South Pole through a given location on the surface of the sphere onto this tangent plane, in which the (X,Y) coordinates are defined. Therefore, there exists a unique mapping on the North polar tangent plane for every point within the bounds

$\mu \in (0,1)$, $\lambda \in (0,2\pi]$, and the same is true for the South polar tangent plane for every point within the bounds $\mu \in (-1,0)$, $\lambda \in (0,2\pi]$. Technically, the plane's axes are naturally stretched evenly in both the X and Y directions by a factor of $a$, the radius of the Earth. Because the stretch is even for both axes, it can be set to unity, and it is set to unity in Table 2.1, as also done in NM02.

Table 2.1: Conversions to and from polar tangent coordinates for the North and South Poles

North Polar Tangent Plane: $X = \sqrt{2(1-\mu)}\,\cos\lambda$, $Y = \sqrt{2(1-\mu)}\,\sin\lambda$; $\lambda = \tan^{-1}(Y/X)$, $\mu = 1 - (X^2+Y^2)/2$

South Polar Tangent Plane: $X = \sqrt{2(1+\mu)}\,\cos\lambda$, $Y = \sqrt{2(1+\mu)}\,\sin\lambda$; $\lambda = \tan^{-1}(Y/X)$, $\mu = (X^2+Y^2)/2 - 1$

During the simulation, the mass within each refined cell is calculated at the end of each meridional cascade sweep by integrating over the same approximating functions from the coarse grid cell's South boundary up to each new refined cell boundary. This way, mass is conserved and distributed properly into each refined cell in preparation for the zonal sweep. It should be noted that this process in no way increases the resolution of the spatial approximation function; rather, it only increases the resolution of the East-West boundaries in the polar regions. Finally, a typical zonal sweep is performed on each refined row, and the masses in the refined cells for each column are summed to obtain the total mass to be remapped into an Eulerian cell in the polar region.
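The Table 2.1 conversions are simple to implement; the sketch below (illustrative names, unit stretch as in the table) includes a round-trip check, with `arctan2` handling the quadrant ambiguity of $\tan^{-1}(Y/X)$.

```python
import numpy as np

def to_north_tangent(lam, mu):
    """(lambda, mu) -> (X, Y) on the North polar tangent plane (Table 2.1)."""
    r = np.sqrt(2.0 * (1.0 - mu))
    return r * np.cos(lam), r * np.sin(lam)

def from_north_tangent(X, Y):
    """(X, Y) -> (lambda, mu) on the North polar tangent plane (Table 2.1)."""
    return np.arctan2(Y, X) % (2.0 * np.pi), 1.0 - (X**2 + Y**2) / 2.0

lam, mu = 1.2, 0.9                  # a point well inside mu in (0, 1)
X, Y = to_north_tangent(lam, mu)
print(np.allclose(from_north_tangent(X, Y), (lam, mu)))   # True
```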

2.1.7.4 Local Tangent Planes for Zonal Boundary Calculations

Also, the present study made a further improvement over NM02 and NSS02, wherein the calculation of every East-West Lagrangian boundary is performed on a local tangent plane. The local tangent plane is similar to the polar tangent plane except that it is constructed at the North Pole of a rotated sphere. This improves the result because linear intersections on a tangent plane correspond to more realistic curve intersections in spherical geometry. The transformation to and from rotated spherical coordinates, $(\lambda', \theta')$, is performed as described in Nair and Jablonowski (2007) and is summarized here. Consider a rotated sphere whose

North Pole is located at the point $(\lambda_p, \theta_p)$. The transformation relations are given in (2.20)-(2.23). After the North and South Lagrangian points used for the determination of an East-West boundary are transformed into rotated spherical coordinates, they are transformed into a North polar tangent plane using the equations in Table 2.1 with rotated input coordinates. After this, a line is passed between the two points and bisected to obtain the longitude of the East-West boundary. This is computationally more expensive than a simple average, but that is not of concern to the present study, since the spatial approximations are the focus and the intermediate and Lagrangian boundaries are calculated only once because the flows are constant. Also, the increase in accuracy in the typical error norms ($L_1$, $L_2$, $L_\infty$, RMSE) was substantial, so the computational expense may be acceptably offset by increased accuracy.
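Before the formal relations are listed, the following sketch illustrates the rotate-then-project pipeline just described, implementing the forward relations (2.20)-(2.21) given below together with the Table 2.1 projection; the function names are illustrative, and `arctan2` is used in place of a bare $\tan^{-1}$.

```python
import numpy as np

def to_rotated(lam, th, lam_p, th_p):
    """(lambda, theta) -> (lambda', theta') on a sphere whose North Pole
    lies at (lam_p, th_p); see eqs. (2.20)-(2.21)."""
    lam_r = np.arctan2(np.cos(th) * np.sin(lam - lam_p),
                       np.cos(th) * np.sin(th_p) * np.cos(lam - lam_p)
                       - np.cos(th_p) * np.sin(th))
    th_r = np.arcsin(np.sin(th) * np.sin(th_p)
                     + np.cos(th) * np.cos(th_p) * np.cos(lam - lam_p))
    return lam_r, th_r

def ew_boundary_lon(p_north, p_south, lam_p, th_p):
    """Rotate the two departure points, project onto the North polar tangent
    plane (Table 2.1), and bisect the connecting chord (sketch)."""
    XY = []
    for lam, th in (p_north, p_south):
        lam_r, th_r = to_rotated(lam, th, lam_p, th_p)
        r = np.sqrt(2.0 * (1.0 - np.sin(th_r)))
        XY.append((r * np.cos(lam_r), r * np.sin(lam_r)))
    Xm = 0.5 * (XY[0][0] + XY[1][0])
    Ym = 0.5 * (XY[0][1] + XY[1][1])
    return np.arctan2(Ym, Xm)   # boundary longitude in rotated coordinates
```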

$$\lambda'(\lambda,\theta) = \tan^{-1}\!\left[\frac{\cos\theta\,\sin(\lambda - \lambda_p)}{\cos\theta\,\sin\theta_p\cos(\lambda - \lambda_p) - \cos\theta_p\sin\theta}\right] \qquad (2.20)$$

$$\theta'(\lambda,\theta) = \sin^{-1}\!\left[\sin\theta\sin\theta_p + \cos\theta\cos\theta_p\cos(\lambda - \lambda_p)\right] \qquad (2.21)$$

$$\lambda(\lambda',\theta') = \lambda_p + \tan^{-1}\!\left[\frac{\cos\theta'\sin\lambda'}{\sin\theta'\cos\theta_p + \cos\theta'\cos\lambda'\sin\theta_p}\right] \qquad (2.22)$$

$$\theta(\lambda',\theta') = \sin^{-1}\!\left[\sin\theta'\sin\theta_p + \cos\theta'\cos\theta_p\cos\lambda'\right] \qquad (2.23)$$

2.1.7.5 Treating the Polar Caps

Finally, the treatment of the polar caps is identical to that described in NM02 and NSS02. The polar caps are defined as the Lagrangian cell row bounded to the North and South by the Lagrangian latitudes in which the Eulerian pole is located. NM02 contains several graphics to help visualize this. In this study as was true in NSS02 and NM02, the meridional Courant

number was restricted to be less than unity such that the Eulerian pole is always located in the Lagrangian cells bordering the Lagrangian pole. This situation is schematically represented in Fig. 2.6, in which thin black circles represent Eulerian latitudes, thin black dashed lines represent Eulerian longitudes, thick red ellipses represent Lagrangian latitudes, thick red dashed lines represent Lagrangian longitudes, the black dot represents the Eulerian pole, the red dot represents the Lagrangian pole, the area shaded light red represents the Lagrangian polar cap, and the thick blue dashed lines represent the grid refinement into three rows just outside the polar cap. The mass in the Lagrangian polar cap may be calculated during the meridional sweep because the zonal sweep transfers mass along Lagrangian latitudes, leaving the total mass in the cap the same. The calculation for the North Pole is first to take the difference between the total column mass and the mass accumulated up to the next-to-last intermediate boundary for each column. The sum of those values across the North polar cell row is the total mass enclosed in the Lagrangian North polar cap. For the South polar cap, the mass is simply the sum of the masses integrated up to the first intermediate boundary for each column. Since the mass cannot be integrated in a zonal sweep for the Lagrangian polar caps, due to the Eulerian singularity enclosed within, the total mass is simply redistributed among the cell centers using a weighting. The weights for redistribution at the cell centers are calculated by employing a traditional bicubic semi-Lagrangian interpolation from the original scalar values, which requires calculating backward trajectories for the Eulerian cell centers (performed at the beginning of the simulation). NM02 simply used the midpoints of the upstream cells, which had already been calculated, in order to find this location; but in this study it makes no difference, because the upstream calculations are not the focus but rather the utility for studying the performance of non-polynomial spatial approximations. Because the bicubic calculations are local and not systematic, a tensor product interpolation is used with cubic Lagrange polynomials rather than a cascade dimensional splitting. Once the interpolated values are obtained for each Lagrangian polar cap cell center, the mass is redistributed using these values as weights

Figure 2.6: Schematic of Eulerian and Lagrangian polar region. Thin, solid, black circles represent Eulerian latitudes and thin, dashed, black lines represent Eulerian longitudes. The thick, solid, red ellipse and thick, red, dashed lines represent the Lagrangian latitudes and longitudes respectively. The red shaded region is the Lagrangian polar cap, the red dot is the Lagrangian pole, and the black dot is the Eulerian pole.

as shown in (2.24)-(2.25), where $I_L$ represents the cubic Lagrange interpolator, $(\lambda^*_i, \mu^*)$ are the coordinates of the respective cell centers ($\mu^*$ has no $i$ subscript because these are all in the same Lagrangian latitude row), $M_T$ is the total mass in the Lagrangian polar cap, $\rho_i$ is the new density, $\Delta\lambda_i$ is the zonal grid spacing for cell $i$, and $\Delta\mu^*$ is the meridional grid spacing. After these values are calculated, they are remapped onto their Eulerian arrival cells, and mass is conserved.

$$w_i = I_L(\lambda^*_i, \mu^*) \qquad (2.24)$$

$$\rho_i = \frac{M_T\, w_i}{\Delta\lambda_i\, \Delta\mu^*\, \sum_j w_j} \qquad (2.25)$$

One undesirable side effect of using a simple Lagrange interpolator is that the interpolation is not monotonic. Therefore, negative weights can indeed be created in the interpolation and weighting process, which degrades accuracy and causes excessive global transport via the positive definite filter. One alternative is to employ a monotonic version of the cubic Hermite interpolator as described in Fritsch and Carlson (1980). This method works by

changing the derivative-like weights of the Hermite basis functions to zero if the derivative itself is zero yielding a monotonic cubic interpolation within the center of a four-point stencil. Fig. 2.7 shows a plot of the monotone Hermite interpolant and the cubic Lagrange interpolant within a four-point stencil with a discontinuous jump from zero to one at the domain center. It

is found that using the monotonic Hermite interpolant does yield superior accuracy; therefore, the monotonic Hermite is used for interpolation at the poles.
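A minimal sketch of such a monotone Hermite interpolant is given below; it zeroes the node slope at local extrema in the spirit of Fritsch and Carlson (1980) but omits their additional slope-magnitude limiting, so it is illustrative only. In production code, `scipy.interpolate.PchipInterpolator` provides a full Fritsch-Carlson-type implementation.

```python
import numpy as np

def monotone_hermite(xk, yk, x):
    """Cubic Hermite interpolation with slopes zeroed at extrema (sketch)."""
    d = np.diff(yk) / np.diff(xk)                     # secant slopes
    m = np.zeros(len(yk))
    same_sign = d[:-1] * d[1:] > 0.0
    m[1:-1] = np.where(same_sign, 0.5 * (d[:-1] + d[1:]), 0.0)
    m[0], m[-1] = d[0], d[-1]
    k = np.clip(np.searchsorted(xk, x) - 1, 0, len(xk) - 2)
    h = xk[k + 1] - xk[k]
    t = (x - xk[k]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1                     # Hermite basis functions
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * yk[k] + h10 * h * m[k] + h01 * yk[k + 1] + h11 * h * m[k + 1]

# The jump of Fig. 2.7: no overshoot above 1 or below 0 near the discontinuity
xk = np.array([0.0, 0.333, 0.667, 1.0])
yk = np.array([0.0, 0.0, 1.0, 1.0])
print(monotone_hermite(xk, yk, np.linspace(0.0, 1.0, 7)))
```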


Figure 2.7: Plot of a monotone Hermite and a cubic Lagrange interpolant fit to a discontinuous jump from zero to one.

2.1.8 Positive Definite Filtering

As will be described in Section 2.2, not all of the functions strictly guarantee monotonicity. Though complete monotonicity is not necessary in a model, complete positive definiteness is required for positive definite quantities such as most moisture variables and chemical tracers. Therefore, in the event of small undershoots which become negative, a positive definite filter was developed to add mass to the negative cells and take that mass away from positive cells in a conservative manner. This process is necessarily somewhat arbitrary, because the existence of negative values is unphysical, and so is their correction. In this algorithm, one sweep is made across the global domain adding enough mass to every negative cell to bring it to zero, and this total amount of added mass is stored. Next, another sweep is made through the entire domain to determine which grid cells have enough mass for a certain amount to be taken away from them. This is done because if too much mass is taken from a cell holding only a small amount, that cell could itself become negative. The total mass of these viable cells is summed. Next, a weighted proportion of the total amount of mass added originally is taken

away from each of the viable cells in order to maintain mass conservation. These weightings are simply the ratio of each cell's mass to the total mass of the viable cells. In this manner, the weightings sum to unity, and mass is conserved. The reason more mass is taken from cells with more mass is that the removal is proportionally smaller for each donor than taking an equal amount from all cells would be. In practice, so little mass is redistributed that the weightings make little difference to accuracy.
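The following sketch captures the two-sweep logic; the names and the donor threshold are illustrative, and it assumes the donor cells hold enough total mass that none is driven negative.

```python
import numpy as np

def positive_definite_filter(mass, donor_thresh=0.0):
    """Conservative positive-definite filter (illustrative sketch)."""
    mass = mass.copy()
    added = -mass[mass < 0.0].sum()        # sweep 1: mass needed to zero out
    mass[mass < 0.0] = 0.0                 # negative cells raised to zero
    if added > 0.0:
        donors = mass > donor_thresh       # sweep 2: cells viable for removal
        weights = mass[donors] / mass[donors].sum()  # weights sum to unity
        mass[donors] -= added * weights    # proportional, conservative removal
    return mass

m = np.array([1.0, -0.02, 3.0, 0.5, -0.01])
print(positive_definite_filter(m).sum(), m.sum())    # totals match
```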

2.2 Sub-Grid Functional Approximations

Here we describe the conservative functions used to approximate the subgrid distribution of an advected quantity, which throughout this study is held to be density. With the exception of PPM alone, all of these methods are new to the context of SL advection, and they form the focal point of this study's first goal. In the following sections, four new non-polynomial approximations are described in detail in their application to the CCS implementation of finite-volume SL advection: the Piecewise Hyperbolic Method (PHM), Piecewise Double Hyperbolic Method (PDHM), Piecewise Double Logarithmic Method (PDLM), and Piecewise Rational Method (PRM). As will be explained in Section 2.2.1, the Piecewise Parabolic Method (PPM) is the current quintessential standard for subgrid approximations, as it is accurate, simple to implement, and conservative. PPM is not inherently monotonic, but with the application of a pre-processing limiter (Colella and Woodward, 1984) or a post-processing limiter (Nair et al., 1999; Zerroukat et al., 2005), it can easily be constrained to be monotonic. These monotonic limiters are not always needed, however, in an operational context, such as with the advection of wind, which may be negative. PPM suffers well-known deficiencies in the presence of sharp gradients and local extrema, and in the last section a description is given of a hybrid method, referred to as PPM-Hybrid (PPM-H), in which non-polynomial methods replace PPM in the presence of these conditions.

2.2.1 Piecewise Parabolic Method (PPM)

The PPM is quite likely the most well-known and studied method in the subgrid approximation literature. It was first developed for simulations of dynamics in an Eulerian Godunov-type context by Colella and Woodward (1984) (hereafter CW84), wherein mass conservation was inherently ensured and monotonic and slope-steepening limiters were described. Godunov methods use characteristics inherent to the set of conservation equations in a given system to resolve fluxes across a boundary between cells under the assumption that fluxes are inherently discontinuous between two cells. To the author's knowledge, with the exception of a post-processing filter developed in Zerroukat et al. (2005) which is applicable to PPM, the same limiters from CW84 are used in implementations of PPM today. Rancic (1992, 1995) brought PPM into the semi-Lagrangian context by construction of a biparabolic 2-D approximation and then a 1-D PPM representation in a cascade framework. Carpenter et al.

(1990) better visualized the monotonic limiting of PPM and adapted the method for use in meteorological modeling with a series of idealized meteorological test cases to judge the performance of PPM. PPM was implemented in this study in the same manner as in Nair et al. (2002), wherein a unique parabola is fit within each cell during a cascade sweep, with a monotonic limiter being used in the context of sufficiently steep gradients (which cause a parabola overshoot; see Fig. 2.8) and local extrema (which cannot be strictly guaranteed to be monotonic and are thus considered to be piecewise constant). Monotonicity is achieved when the approximation function is bounded within the neighboring cell densities for the entire domain enclosed by the cell in question.

For instance, in the schematic provided in Fig. 2.8, a parabola is being constructed for the shaded cell, which is neighbored by the two cells shown. Monotonicity requires that the approximating function stay within the bounds of the neighboring cells (denoted by dashed lines in the figure). PPM works by fitting a unique parabola to the cell mean and the left and right interface values (which are determined beforehand using a conservative cubic interpolation). Sometimes, in the presence of steep gradients, the PPM overshoots or undershoots one of the neighboring cell mean values somewhere within the local domain. In these cases, the interface values must be altered (denoted by the red square) such that the parabola stays within the monotonic range. This degrades the accuracy of PPM because the interface values are no longer reconstructed up to full accuracy.

Figure 2.8: Schematic of PPM undershoot and the result of monotonic limiting. The thin, dashed, black line represents the left cell's average, which the center cell's approximation may not exceed if monotonicity is to be maintained. The red line represents the monotonically limited PPM approximation.

The equation to obtain the interface values for an irregular grid and the monotonic limiter can be found in CW84. It should be noted, for the sake of comparison, that PPM requires a four-cell stencil to fit a unique parabola to a cell. This may seem strange given that a parabola has three coefficients and requires only three constraints. However, the process of reconstructing the boundary values uses a cubic interpolator, which itself requires a four-cell stencil (because a cubic polynomial has four coefficients to solve for). The pre-processing limiter used in this study (that of CW84) consists of testing for the existence of an overshoot or the existence of local extrema (three tests in all). In the presence of local extrema (signaled by differing signs

between the left and right derivatives), the function is assumed to be constant by setting the interface values to the same value as the cell mean. In the presence of overshoots or undershoots, one of the interface values is simply brought a certain amount closer to the cell mean. After the interface values have been reconstructed and limiting has been performed, the construction of PPM may proceed. The first constraint, essential to all of these methods, is conservation, which constrains the parabola to conform to (2.9). The other two constraints fit the left and right parabola values to match the interface values (which at this point have been monotonically limited) as follows:

$$\tilde\rho\!\left(x_{i-1/2}\right) = \rho_L, \qquad \tilde\rho\!\left(x_{i+1/2}\right) = \rho_R \qquad (2.26)$$

Under these constraints, a unique parabola may be obtained such that

$$\tilde\rho_i(\xi) = \bar\rho_i + (\rho_R - \rho_L)\,\xi + \rho_S\left(\frac{1}{12} - \xi^2\right) \qquad (2.27)$$

where $\xi$ is a normalized coordinate given by

$$\xi = \frac{x - x_0}{\Delta x_i}, \qquad (2.28)$$

$x_0$ is the cell midpoint, $\bar\rho_i$ is the mean density of cell $i$, $\rho_L$ and $\rho_R$ are the left and right interface values of cell $i$ respectively, $\tilde\rho_i$ (notation that will be used throughout) represents the subgrid approximation function for cell $i$, and $\rho_S$ is defined as

$$\rho_S = 6\bar\rho_i - 3(\rho_L + \rho_R).$$

As can be seen from the definition, $\xi$ is constrained to the range $\xi \in (-1/2, 1/2)$. To facilitate easier integration of (2.27), it is put into standard form:

$$\tilde\rho_i(\xi) = \hat a\,\xi^2 + \hat b\,\xi + \hat c \qquad (2.29)$$

$$\hat a = -\rho_S, \qquad \hat b = \rho_R - \rho_L, \qquad \hat c = \bar\rho_i + \frac{\rho_S}{12} \qquad (2.30)$$

When implementing the PPM method in this particular formulation of the CCS, the only integration needed is from the left / bottom cell edge of an Eulerian cell up to an arbitrarily located Lagrangian boundary within the cell. This is given by:

$$\int_{-1/2}^{\xi^*} \left(\hat a\,\xi^2 + \hat b\,\xi + \hat c\right) d\xi = \left[\hat a\,\frac{\xi^3}{3} + \hat b\,\frac{\xi^2}{2} + \hat c\,\xi\right]_{-1/2}^{\xi^*} \qquad (2.31)$$

To give a good idea of what the PPM approximation looks like in an extreme test case (with many local extrema), Fig. 2.9 shows the PPM representation of the subgrid distribution of an irregular signal taken from Zerroukat et al. (2005), with and without monotonic limiting. The figure can be misleading at first because the monotonic PPM seems to fit the scalar means fairly well. However, the whole purpose of a spatial approximation function is not to fit the means themselves but to accurately describe the distribution within the cells. As the figure shows, the monotonic limiting of PPM severely inhibits the innercellular variation and degrades the order of accuracy of the solution.

To briefly summarize the process of PPM:

1. Interpolate all cell interface values from cell means using a cubic, monotone, conservative interpolation (requires four cells of information, hence PPM's four-cell stencil).

2. For all cells, apply the monotonic limiter to alter the left and right interface values to ensure a monotonic parabola within the cell. For cells with local extrema, both interface values are set to the cell mean (piecewise constant representation). For cells in which the current parabola would be non-monotonic within the cell, one of the interface values is altered in order to guarantee a monotonic parabola.

Figure 2.9: Output of PPM representations of the subgrid distribution for an irregular signal: (a) without monotonic limiting; (b) with monotonic limiting. Black boxes are the actual cell means, and red lines are the piecewise parabolas fit to the means.


3. For each cell, using the cell mean and left and right interface values, create a unique, conservative, and monotonic parabolic function to describe the inner-cell distribution using (2.27).
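A sketch of the reconstruction and integration steps (2.29)-(2.31) follows; the CW84 interface reconstruction and monotonic limiter of steps 1-2 are taken as given (the limited values $\rho_L$, $\rho_R$ are inputs), so this is illustrative rather than a full PPM implementation.

```python
import numpy as np

def ppm_coeffs(rho_bar, rho_L, rho_R):
    """Parabola coefficients (2.29)-(2.30) from the cell mean and the
    already-limited interface values; xi ranges over (-1/2, 1/2)."""
    rho_S = 6.0 * rho_bar - 3.0 * (rho_L + rho_R)
    return -rho_S, rho_R - rho_L, rho_bar + rho_S / 12.0

def ppm_partial_mass(a, b, c, xi, dx=1.0):
    """Integral (2.31) from the left cell edge up to a Lagrangian boundary
    at normalized coordinate xi."""
    F = lambda s: a * s**3 / 3.0 + b * s**2 / 2.0 + c * s
    return dx * (F(xi) - F(-0.5))

# Conservation check (2.9): the integral across the whole cell is the mean
a, b, c = ppm_coeffs(2.0, 1.5, 2.2)
print(np.isclose(ppm_partial_mass(a, b, c, 0.5), 2.0))   # True for dx = 1
```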

2.2.2 Piecewise Hyperbolic Method (PHM)

The PHM was first introduced by Marquina (1994), hereafter M94, to advective transport, likewise in an Eulerian Godunov-type context. The method was very recently improved by application of a new type of limiter in Serna (2006), hereafter S06. As shown in S06, two advantages of using hyperbolas to approximate the subgrid distribution rather than parabolas are that hyperbolas are less oscillatory and that they require only a three-point stencil, whereas PPM requires a four-point stencil. As the name implies, this method works by fitting piecewise hyperbolas to the cell means to describe the subgrid variation. The hyperbolas used in this scheme are simple rectangular hyperbolas of the form

$$f(x) = C_1 + \frac{1}{C_2(x - x_0) + C_3} \qquad (2.32)$$

in which $C_1$, $C_2$, and $C_3$ are arbitrary constants. The actual form used in M94 is given by

\tilde{\rho}_i(x) = \bar{\rho}_i + d_i \, \frac{h_i (2 - \alpha_i)}{\alpha_i^2} \, \ln\!\left( \frac{h_i (2 - \alpha_i)}{h_i (2 + \alpha_i) - 2\alpha_i (x - x_0)} \right) \quad (2.33)

for cell i in which h_i is the grid spacing, x_0 is the cell center, and d_i and \alpha_i are parameters that define the hyperbola.

The parameters \alpha_i and d_i are defined as in M94, with an averaging applied to limit the parameter d_i and make the solution Local Total Variation Bounded (LTVB). The limiting of

PHM is accomplished by limiting the derivative parameter di, and the value for the parameter

\alpha_i is modified for consistency. PHM achieves its third-order accuracy by using a second-order approximation to the derivative at the cell center, with the left and right interface derivatives computed as averaged differences across the interfaces using the cell means and the distances between cell centers. The left and right interface derivatives are defined as

d_L = \frac{\bar{\rho}_i - \bar{\rho}_{i-1}}{x_{0,i} - x_{0,i-1}} \quad (2.34)

d_R = \frac{\bar{\rho}_{i+1} - \bar{\rho}_i}{x_{0,i+1} - x_{0,i}} \quad (2.35)

where x_{0,i} represents the cell center location for cell i. As with PPM, constraint (2.9) is essential for conservation and is applied to PHM. The other constraint used is that PHM interpolates one

of the lateral derivatives to second-order accuracy such that either (2.36) or (2.37) is true.

\tilde{\rho}_x\!\left(x_{i-1/2}\right) - d_L = O\!\left(h^2\right) \quad (2.36)

\tilde{\rho}_x\!\left(x_{i+1/2}\right) - d_R = O\!\left(h^2\right) \quad (2.37)

In any Godunov-type context, it is necessary to prove that a scheme does not add total variation (TV) to a solution in order to prove formally that the method converges to a weak form of the advection equation. This condition of not adding total variation to the solution (or adding only a bounded amount) is called total variation diminishing (TVD). To prove that each cell's variation is bounded locally is also to prove that the scheme as a whole is TVD and thus converges to the correct solution. Since the PPM is proven monotonic, it is implied that the scheme is also Local Total Variation Bounded (LTVB) because no new extrema may be produced, and existing extrema are not magnified. However, the PHM is not based on the actual left and right interface values but on the left and right interface derivatives, which makes a strictly monotonic limiting infeasible. M94 proves the solution to be LTVB, however, which in practice limits the magnitude of the undershoots and overshoots as it is applied in a SL framework. It was found in M94, however, that this averaged limiting of the derivative still required significant smoothing in the case of nonlinear flows to keep the scheme stable.

Therefore, M94 applied a harmonic limiting to the parameter d_i. This was found to acceptably control the solution's variation, providing inherent stability, and yielded fairly accurate results. The concept of LTVB is directly applicable only to the Eulerian Godunov-type context in proving that the approximated fluxes will allow the method to converge to a weak (integral form) solution of the advection equation. However, as stated in M94 and S06, if a method is LTVB, it is also TVD, meaning that the total variation of the solution is non-increasing. Another form of this condition states that the total solution variation must be absolutely bounded by a time-independent constant. There are two typical forms of the Total Variation (TV) constraint in 1-D wherein TV is defined by

(2.38) or (2.39).

TV = \int_{\Omega} \left| \frac{\partial \rho}{\partial x} \right| dx \quad (2.38)

TV = \sum_i \left| \rho_{i+1} - \rho_i \right| \quad (2.39)

Usually, it is inconvenient to describe the TV with the integral form, and therefore it is most often quantified in its discrete summation form. However, in this case, since PHM is formed by approximating the left and right derivatives d_L and d_R, the LTV inside of a hyperbolic approximation cell is defined as

TV_i = h_i \sqrt{|d_L| \cdot |d_R|}. \quad (2.40)

The TV across the entire domain is simply the sum of the LTV values for each cell, and if all local cells are LTVB, then the total solution must be TVD. These concepts are not directly applicable to the CISL framework, and therefore, they cannot directly prove that a CISL method is stable and convergent to a weak solution of the advection equation. However, what LTVB and TVD do imply is that the left and right derivatives of the hyperbolas are indeed bounded. Because the derivatives at the left and right interfaces are bounded and the mean must be the same as the original cell mean, we are guaranteed that the overshoots and undershoots are bounded as well. Therefore, when integrating over these approximations, it is guaranteed that monotonicity will only be violated within a given tolerance. This concept of monotonicity within a given tolerance is not unique to this study. In a short paper by Fitzgerald et al. (2005), the concept of a sequence being quasi-monotonic was quantified such that for a monotonically increasing sequence, a certain amount of monotonicity is allowed to be violated such that

\rho_i - \rho_j \ge -\mathrm{tol} \quad \forall \, \{i, j : i > j\}. \quad (2.41)

Now, in the CISL context, almost no initial data will actually be monotonic, and therefore the term takes on a new meaning such that a subsequence of data which is monotonic must remain monotonic. For instance, steep gradients pose problems because overshoots or undershoots tend to happen in which new extrema are created, breaking the monotonicity of the data at the previous time step. Allowing this form of monotonicity to be violated within a bounded amount is not unique to this study. Monotonicity has also been allowed to suffer a small amount of violation in a wide class of methods called essentially non-oscillatory (ENO) methods (Harten, 1997 and Harten et al., 1987). As long as monotonicity is constrained within a certain time-independent tolerance, then even in the case of a 1-D square wave, the Gibbs phenomenon should be well-limited as well. This proposition in application to the CISL method, however, should be rigorously tested in a non-linear framework before being held confidently. Throughout this study (that is, for PDLM, PDHM, PRM, and further modifications on PHM), when fitting the approximations, the LTVB condition will be strictly enforced on each cell to ensure the methods are quasi-monotonic.

The parameters from M94 are not described in detail in this paper because they are not the actual values used. This is because, very recently, S06 developed a new and more accurate limiter for the PHM called a power limiter. In that study, the powereno_p function is given by

\mathrm{powereno}_p(d_L, d_R) = \mathrm{minsign}(d_L, d_R) \, \mathrm{power}_p(|d_L|, |d_R|) \quad (2.42)

based on the following two functions:

\mathrm{power}_p(a, b) = \frac{a + b}{2} \left( 1 - \left| \frac{a - b}{a + b} \right|^p \right), \quad (2.43)

\mathrm{minsign}(d_L, d_R) = \begin{cases} \mathrm{sign}(d_L) & \text{if } |d_L| \le |d_R| \\ \mathrm{sign}(d_R) & \text{if } |d_L| > |d_R| \end{cases} \quad (2.44)

The inputs to powereno_p are the left and right interface derivatives, from which a limited estimate of the derivative at the cell center is obtained. It is proven in S06 that this limiter makes the PHM scheme LTVB. The derivative parameter d_i is, therefore, given by

d_i = \mathrm{powereno}_3(d_L, d_R) = \mathrm{minsign}(d_L, d_R) \, \min(|d_L|, |d_R|) \, \frac{d_L^2 + d_R^2 + 2\left(\max(|d_L|, |d_R|)\right)^2}{\left(|d_L| + |d_R|\right)^2} \quad (2.45)

where powereno_3 is cast in a more efficient form for p = 3, and the corresponding value for \alpha_i is given by

\alpha_i = \begin{cases} 2\left( \sqrt{ \dfrac{d_L^2 + 3 d_R^2}{(|d_L| + |d_R|)^2} } - 1 \right) & \text{if } |d_L| \le |d_R| \\[2ex] 2\left( 1 - \sqrt{ \dfrac{3 d_L^2 + d_R^2}{(|d_L| + |d_R|)^2} } \right) & \text{if } |d_L| > |d_R| \end{cases} \quad (2.46)

It is obvious from (2.33) that two cases would render a singularity. First, if a case were encountered such that |\alpha_i| = 2, then either the natural logarithm term would become undefined or the denominator of the ratio within the natural logarithm term would become zero, resulting in a NaN (Not a Number) value during computation. This, however, never happens

because, as S06 demonstrates, the definitions of \alpha_i restrict it such that it is always within the range -2(\sqrt{3} - 1) \le \alpha_i \le 2(\sqrt{3} - 1). The second and more obvious case in which a singularity could (and does, as it turns out) occur is if \alpha_i = 0, since this is well within the range of possible values.

Because the S06 context is a Godunov-type method, the singularity can be removed with ease. However, in this context, a continuously integrable function is necessary across the whole domain of a cell, which makes the situation more complicated. To gain more understanding of

this singular case, the terms in the expressions for αi are expanded in (2.47).

\alpha_i = \begin{cases} 2\left( \sqrt{ \dfrac{d_L^2 + 3 d_R^2}{d_L^2 + d_R^2 + 2|d_L||d_R|} } - 1 \right) \\[2ex] 2\left( 1 - \sqrt{ \dfrac{3 d_L^2 + d_R^2}{d_L^2 + d_R^2 + 2|d_L||d_R|} } \right) \end{cases} \quad (2.47)

The only way \alpha_i can be zero is if the square root expression is rendered unity, or in other words if either (2.48) or (2.49) is true.

d_L^2 + d_R^2 + 2|d_L||d_R| = d_L^2 + 3 d_R^2 \quad (2.48)

d_L^2 + d_R^2 + 2|d_L||d_R| = 3 d_L^2 + d_R^2 \quad (2.49)

These can be trivially reduced to the relations 2|d_L||d_R| = 2 d_R^2 or 2|d_L||d_R| = 2 d_L^2. Clearly, the only way either of these relations can be true is if d_L = d_R. Since the left and right derivatives are calculated from the cell means as centered differences, this implies that the variation within this cell is best approximated by a straight line, which is a second-order accurate representation in the CISL framework. Therefore, if the derivatives are close enough

based on a tolerance (that is, if |\alpha_i| \le \mathrm{tol}), then at that location the solution is represented by the piecewise linear approximation given in (2.50). When implemented, the tolerance was set to h^2,

where h is defined as h = 1/N with N the number of cells. Note that this paper distinguishes h from h_i: h_i is the grid spacing of a single cell, and h is the average grid spacing. Also, as given in the algorithm in S06, if the magnitudes of both the left and the right derivatives are less than a predefined tolerance (the same as before), then both d_i and \alpha_i are set to zero, leading to a piecewise constant expression.

[Figure 2.10 panels: (a) hyperbolic fitting with an exponent p = 3; (b) hyperbolic fitting with an exponent p = 3.95. Axes: Density vs. x-axis.]

Figure 2.10: Output of PHM representations of subgrid distribution for irregular signal. Black boxes are the actual cell means, and red lines are the piecewise hyperbolas fit to the means.


\tilde{\rho}_i(x) = \bar{\rho}_i + d_i (x - x_0) \quad (2.50)

Fig. 2.10 shows a plot of the PHM representations of the same initial data as in Fig. 2.9 to give a feel for how PHM's functions look with fairly noisy initial data. It is clear that the cell reconstructions violate the neighboring cell means, but it will be shown later that PHM inherently limits the LTV when integrated. Additionally, it is visually evident that PHM cannot handle local extrema. Clearly as well, increasing the power limiter exponent allows more variation within cells. The “spikes,” as in the cell just right of location x = 0.5, are indicative of local extrema and how PHM reconstructs those cells. Note that the smaller-magnitude lateral derivative is interpolated in the process, which is consistent with the mathematical formulation. This is part of PHM's limiting mechanism. It is stated in S06 that the power limiter method was more effective because it allowed more

variation within the approximated cells, as the range for \alpha_i increased from -2(\sqrt{2} - 1) \le \alpha_i \le 2(\sqrt{2} - 1) in M94 to -2(\sqrt{3} - 1) \le \alpha_i \le 2(\sqrt{3} - 1) in S06. It is also stated that a sufficient condition for LTVB in PHM is to make sure that the value \alpha_i is restricted to this latter range,

which basically means that the constraint for avoiding logarithmic singularities is the same as the LTVB constraint. Therefore, in this study, different powers of the power limiter are investigated to see if even more cell variation further improves the solution, with checks to ensure that the solution is still LTVB. Because the power limiter applies the exponent to an absolute value expression, any real exponent may theoretically be used without the need to treat complex results. The value \alpha_i depends on the ratio,

\eta_i = \frac{d_i}{\min(|d_L|, |d_R|)}, \quad (2.51)

in the relationship given in (2.53). Therefore, for the power limiter given in (2.42), for any real value p, both d_i and \alpha_i are now known, allowing for a unique hyperbola to be constructed. It

should be mentioned that there exists the possibility of singularity in the function ηi if one of the derivatives is zero. Therefore, a number, ε, which may be as small as machine precision is

used to keep this divide by zero from happening such that

\eta_i = \frac{d_i + \varepsilon}{\min(|d_L|, |d_R|) + \varepsilon}. \quad (2.52)

When implemented, \varepsilon is set to \varepsilon = 10^{-10}, as there is little sensitivity to it. It is clear from (2.53) that in order to fulfill the requirements for LTVB (-2 < \alpha_i < 2), \eta_i is restricted to the range \eta_i < 4 for all cells. Therefore, for any possible combination of the values |d_L| and |d_R|, the maximum value of \eta_i must be restricted to be less than four to ensure that the method is LTVB and thus restricts overshoots to provide for a stable CISL method.

\alpha_i = \begin{cases} 2\left[ \sqrt{\eta_i} - 1 \right] & \text{if } |d_L| \le |d_R| \\ 2\left[ 1 - \sqrt{\eta_i} \right] & \text{if } |d_L| > |d_R| \end{cases} \quad (2.53)

To facilitate constraining the exponent, p, used in the power limiter to allow for more variation within cells yet also keep the method quasi-monotonic, it is best to see how the ratio

57 function, ηi, behaves based on this exponent since constraining its maximum value to less than

four is essential. Fig. 2.11 shows four graphs of ηi based on differing values of the exponent

and over a wide range of dL and dR. Because absolute values are used throughout the power limiter, all values used in these figures are positive. Also, since it is stated in S06 and evident

from (2.42) that \eta_i is based on the ratio of the input numbers, d_L is held at unity and d_R is given values 10^x where x is increased from 0 to 10 by increments of 10^{-1}. From the plots, it is evident that in general,

\max_{\forall \{|d_L|, |d_R|\}} \eta_i^{\langle p \rangle} = p \quad (2.54)

where \eta_i^{\langle p \rangle} is the notation used to describe the ratio function for cell i based on a power limiter of exponent p. This is exactly what was obtained by S06, wherein the power_3 limiter was shown to have a theoretical maximum value of exactly 3.

However, it was noted in this test that the values for ηi can actually exceed a given p because of truncation errors in the floating point arithmetic. Therefore, it is more accurately

stated in an implementational sense that

\max_{d_L, d_R} \eta_i^{\langle p \rangle} = p + \varepsilon \quad (2.55)

where 0 \le \varepsilon \le O(10^{-5}) is the bounding constraint found in practice, even with an exponent testing mesh of 0.01. Therefore, even using p = (4.0 - \varepsilon) = 3.99999 should generally not be regarded as stable and LTVB considering the numerical implementation of the power-limited PHM. However, it should be safe to use any value such that p < 3.999 to facilitate a greater amount of variation within a cell.
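The bound summarized by (2.54)-(2.55) is straightforward to reproduce numerically. The following Python sketch (illustrative, not from the original experiments) sweeps the derivative ratio over ten decades, as in Fig. 2.11, and reports the maximum of the ratio function for several exponents:

    import numpy as np

    def eta(dL, dR, p, eps=1e-10):
        # Ratio function of (2.51)-(2.52), with |powereno_p| of (2.42)-(2.43) as d_i
        a, b = abs(dL), abs(dR)
        d = 0.5 * (a + b) * (1.0 - abs((a - b) / (a + b)) ** p)
        return (d + eps) / (min(a, b) + eps)

    # Hold dL = 1 and sweep dR = 10**x for x in [0, 10), as in Fig. 2.11
    for p in (3.0, 3.5, 3.9, 4.0):
        m = max(eta(1.0, 10.0 ** x, p) for x in np.arange(0.0, 10.0, 0.1))
        print(p, m)   # the maximum approaches p, consistent with (2.54)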

To show the effects of allowing a greater amount of variation in the PHM reconstruction,

Fig. 2.10b gives a plot of the PHM representation with p = 3.95, which is nearing the limit of stability and convergence for PHM advection. As can be easily seen from the figure, there is more variation within the cells as the exponent is increased, and one can almost immediately expect overshoots and undershoots to be greater because of this effect, leading to a less strict tolerance when considering quasi-monotonic properties.


Figure 2.11: Values for the ratio of the power limiter value to the min value over a wide range. The x-axis is displayed with log scaling. The black, red, blue, and green lines represent exponents of 3, 3.5, 3.9, and 4.0 respectively.

The relative magnitudes of change between p = 3 and p = 3.95 seem to be greatest at jump discontinuities and least with smooth data. This would seemingly be a desirable effect in resolving sharp boundaries. It should

also be noted that while the approximations may seem to have large overshoots, the extremes cover a very small area compared to the rest of the cell, and the actual mass calculated in the CISL scheme will be much smaller when integrated. This is not to mention that they only occur at jump discontinuities, and when integrating across the bounding cells (as Lagrangian boundaries generally cover more than just one Eulerian cell), the effect of PHM functional

overshoots will also be naturally reduced this way. The results section will show whether or not this actually improves PHM’s advection properties and accuracy. S06 noted that adding

variation by lessening the restrictions on αi increased accuracy. Therefore, to be able to more

flexibly refine the bounds of \alpha_i via the rational exponents introduced in this study may allow for a greater optimization of accuracy in PHM.

[Figure 2.12 panels: (a) square wave; (b) irregular signal. Axes: L2 error vs. PHM exponent.]

Figure 2.12: L2 error norms for varying PHM exponents on a grid of 80 cells.

It is important to obtain some sort of optimal value for the PHM power exponent for the most effective comparison with PPM. Because this framework is truly 1-D, it is not computationally infeasible to take a brute force approach to finding the optimal exponent for each of the five different types of initialization data (which may be reviewed in Section 2.4). Therefore, an experiment is run testing the error norms of the solutions with PHM using 1,000 different exponents incremented equally within the range p \in (3, 4] for a computational mesh size of 80 cells. The graphs of the L2 error norms are given in Fig. 2.12 for the square wave and the irregular signal (those profiles themselves are displayed in Fig. 2.20). The sine wave, steep gradient, and triangle wave plots are omitted because their shape is very similar to that of the square wave. The experiment was also repeated for 160 computational cells, and the results

were very similar. It turns out that increasing the exponent towards four increases accuracy, and there is no optimum point within the range save at the end of it. For this reason, an exponent

value of p = 3.999 is used for PHM throughout this study. To summarize the process of PHM reconstruction:

1. For each cell, calculate the left- and right-hand derivatives (dL and dR respectively) based

on a simple centered differencing with the neighboring cell means (three-cell stencil), shown in (2.34) and (2.35).

2. For each cell, using the left and right derivatives for the cell, construct a limited estimate of the centered derivative, d, with the desired exponent in the power limiter. If the lateral

derivative magnitudes are both less than the tolerance (defined as n^{-2} where there are n cells), d is set to zero. If not, d can be calculated using the generic powereno limiter given in (2.42) using any power p, which is typically higher than three but must be strictly less than four. Calculating d in this manner ensures the hyperbola will accurately

interpolate the centered derivative estimate.

3. For each cell, calculate a value of \alpha such that the hyperbola interpolates both the centered derivative and one lateral derivative (namely, the lateral derivative of smaller magnitude). As before, if the lateral derivative magnitudes are both less than the tolerance, then set \alpha to zero. Otherwise, the equation for this calculation is given in (2.53) and (2.52).

4. For each cell, construct a unique, limited hyperbola within the cell using d and \alpha following (2.33), as long as the magnitude of \alpha is not less than the tolerance. If the magnitude of \alpha is less than the tolerance, then construct a line given by (2.50) within the cell.
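The following Python sketch assembles steps 1-3 for a single cell (names are illustrative; the magnitude of d is used in the ratio function here, and the hyperbola evaluation of step 4 from (2.33) is omitted). A returned \alpha of zero signals the linear fallback of (2.50):

    import math

    def phm_parameters(rho_m1, rho_0, rho_p1, x0_m1, x0_0, x0_p1, p=3.999, n_cells=80):
        # Step 1: lateral derivatives from neighboring cell means, (2.34)-(2.35)
        dL = (rho_0 - rho_m1) / (x0_0 - x0_m1)
        dR = (rho_p1 - rho_0) / (x0_p1 - x0_0)
        tol, eps = 1.0 / n_cells**2, 1e-10
        if abs(dL) < tol and abs(dR) < tol:
            return 0.0, 0.0                 # piecewise constant cell
        # Step 2: power-limited centered derivative, (2.42)-(2.44)
        a, b = abs(dL), abs(dR)
        sgn = math.copysign(1.0, dL) if a <= b else math.copysign(1.0, dR)
        d = sgn * 0.5 * (a + b) * (1.0 - abs((a - b) / (a + b)) ** p)
        # Step 3: alpha from the ratio function, (2.52)-(2.53)
        eta = (abs(d) + eps) / (min(a, b) + eps)
        alpha = 2.0 * (math.sqrt(eta) - 1.0) if a <= b else 2.0 * (1.0 - math.sqrt(eta))
        if abs(alpha) <= tol:
            alpha = 0.0                     # use the line of (2.50) instead
        return d, alpha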

2.2.3 Piecewise Double Logarithmic Method (PDLM)

The PDLM method was developed in Artebrant and Schroll (2006), hereafter AS06, based on a derivation similar to that of the Piecewise Logarithmic Method (PLM) in Artebrant and Schroll (2005). The reason PLM was not used in this study is that a tolerance is not readily applied to the function, making it difficult to obtain a continuously integrable single logarithmic function across each cell domain. In AS06, double logarithmic functions are used without limiting to yield a TVD, conservative, third-order scheme with a three-cell stencil, again within the Eulerian

61 Godunov context. The density approximation function for a given cell i is defined by

\tilde{\rho}_i(x) = \bar{\rho}_i + \phi_i(x) - \bar{\phi}_i = \bar{\rho}_i + \phi_i(x) - \frac{1}{h_i} \int_{x_{i-1/2}}^{x_{i+1/2}} \phi_i(x) \, dx \quad (2.56)

where \phi_i is the actual double logarithmic function,

\phi_i(x) = -\frac{c_i h_i}{a_i} \ln\!\left( x - x_0 - \frac{h_i}{2}\left( \frac{2}{a_i} - 1 \right) \right) - \frac{d_i h_i}{b_i} \ln\!\left( x - x_0 - \frac{h_i}{2}\left( \frac{2}{b_i} - 1 \right) \right), \quad (2.57)

the coefficients a_i, b_i, c_i, and d_i are defined as follows:

c_i = \frac{(a_i - 1)\left[ d_R - (1 - b_i) d_L \right]}{b_i - a_i}, \quad (2.58)

d_i = d_L - c_i, \quad (2.59)

b_i = \frac{a_i}{a_i - 1}, \quad (2.60)

a_i = (1 - \mathrm{tol}) \left( 1 + \mathrm{tol} - \frac{2 |d_L|^q |d_R|^q + \mathrm{tol}}{|d_L|^{2q} + |d_R|^{2q} + \mathrm{tol}} \right) \quad (2.61)

d_L and d_R represent second-order centered approximations of the density derivatives at the left and right boundaries respectively, given by (2.34)-(2.35), q = 1.4, \mathrm{tol} = h^q / 10, h = 1/N, and h_i and x_0 represent the grid spacing and cell midpoint of cell i respectively. The

bounding values x_{i-1/2} and x_{i+1/2} represent the left and right cell interface locations for cell i respectively. From the definition in (2.56), it is obvious by application of the integral mean

value that this method achieves conservation by subtracting off the mean value of whatever double logarithmic function develops in (2.57). This does lead to a certain amount of overhead

and also to some computational inaccuracies which will be discussed later. This overhead is minimized by storing the calculated average value, \bar{\phi}_i, for each cell when calculating the original function coefficients so that it does not need to be recalculated during integration. The method is proven to be LTVB, with the value for q required to be 1.4, yielding a scheme

with bounded violation of monotonicity. Because it is suggested in AS06 that q not be changed for the general case, it is kept at 1.4 in this study. Within (2.57) alone, there exists the possibility

of a singularity if a_i = 0 (which would also lead to b_i = 0 given the definition of b_i). However, the purpose of the tolerance introduced in (2.61) is to keep the parameter a_i from taking on the value zero and to restrict it such that it is bounded as well. Therefore, there is no need to treat

a case of this singularity in PDLM. There does exist the possibility of a negative value inside the logarithmic expressions, however, and in practice if (2.57) is used, this does indeed occur. To resolve this matter, one could simply solve the logarithm in complex space and take the real part afterward. However, complex logarithms require both more space and more computational

time, and it turns out that the identity,

\Re(\ln(C)) = \ln(|C|), \quad (2.62)

holds true, wherein the function \Re represents the extraction of the real part of a complex number. Therefore, an absolute value is taken of the logarithm argument, and the equation used in practice is

\phi_i(x) = -\frac{c_i h_i}{a_i} \ln\left| x - x_0 - \frac{h_i}{2}\left( \frac{2}{a_i} - 1 \right) \right| - \frac{d_i h_i}{b_i} \ln\left| x - x_0 - \frac{h_i}{2}\left( \frac{2}{b_i} - 1 \right) \right|. \quad (2.63)

For computational efficiency and convenience in integration, the parameters \alpha_i and \beta_i are introduced and defined as:

\alpha_i = x_0 + \frac{h_i}{2}\left( \frac{2}{a_i} - 1 \right), \quad (2.64)

\beta_i = x_0 + \frac{h_i}{2}\left( \frac{2}{b_i} - 1 \right). \quad (2.65)

This means the integral mean value, \bar{\phi}_i, is

\bar{\phi}_i = \frac{1}{h_i} \int_{x_{i-1/2}}^{x_{i+1/2}} \phi_i(x) \, dx = -\left[ \frac{c_i}{a_i} (x - \alpha_i) \left( \ln|x - \alpha_i| - 1 \right) + \frac{d_i}{b_i} (x - \beta_i) \left( \ln|x - \beta_i| - 1 \right) \right]_{x_{i-1/2}}^{x_{i+1/2}}. \quad (2.66)

The integral of the PDLM density approximations from the nearest left Eulerian boundary

(x_{\hat{i}-1/2}) up to a given Lagrangian boundary (x^*_{i-1/2}) is similarly given by

\int_{x_{\hat{i}-1/2}}^{x^*_{i-1/2}} \tilde{\rho}_{\hat{i}}(x) \, dx = \left[ \left( \bar{\rho}_{\hat{i}} - \bar{\phi}_{\hat{i}} \right) x - \frac{c_{\hat{i}} h_{\hat{i}}}{a_{\hat{i}}} (x - \alpha_{\hat{i}}) \left( \ln|x - \alpha_{\hat{i}}| - 1 \right) - \frac{d_{\hat{i}} h_{\hat{i}}}{b_{\hat{i}}} (x - \beta_{\hat{i}}) \left( \ln|x - \beta_{\hat{i}}| - 1 \right) \right]_{x_{\hat{i}-1/2}}^{x^*_{i-1/2}} \quad (2.67)

and the cell approximation parameters at iˆ are used because this is the index of the Eulerian cell over which we are integrating. Fig. 2.13 shows a plot of the PDLM approximation to the same irregular signal problem as with PPM and PHM to give a visible indication of how the method is performing. As with the PHM, PDLM has no strict monotonic limiting and therefore is limited by the LTVB

constraint placed on the exponent q presented earlier. One downfall of this method is that it is more computationally expensive with the need to calculate the integrated average and the need to evaluate four natural logarithms for every integration procedure along with many multiplications and divisions as well. The major benefit of this method as shown in the figure

is that because PDLM is a combination of two logarithms, it is not inherently monotone and is capable of resolving local extrema up to full accuracy. It was discovered, however, in the development of this scheme that there exist certain inherent computational difficulties in PDLM

mainly during integration. Fig. 2.14 shows two plots of a sine wave being approximated by PDLM in single floating point precision instead of double precision. A sine wave is used because it illustrates well the conditions under which PDLM degrades, and that is when the left and right derivatives are roughly equal.

When the derivatives are nearly equal, the parameter values a_i and b_i reduce down to a_i = \mathrm{tol}(1 - \mathrm{tol}) \approx \mathrm{tol} and b_i = \mathrm{tol}/(\mathrm{tol} - 1) \approx -\mathrm{tol}. The error is clearly in the integration procedure because of the observation that the functional shape seems to be calculated correctly, but the integrated mean seems to be off by a great amount. Since under the above conditions a_i and b_i are of opposite sign yet similar magnitude, and the sign of the error alternates erratically with each successive cell, it seems likely that the dominant cause of error is floating point cancellation between the two main terms. In fact, it turns out that in these conditions, \ln(x_R - \alpha_i) \approx \ln(x_L - \beta_i) and \ln(x_L - \alpha_i) \approx \ln(x_R - \beta_i), to the point of even being identical in single precision floating point representation.

For instance, in the first cell of Fig. 2.14b, the values were a_1 = 2.258345 \times 10^{-4} and b_1 = -2.258855 \times 10^{-4}, which causes problems in floating point arithmetic because single precision only supports almost seven significant digits in a base-ten system (corresponding to 23 binary bits in the IEEE 754 standard). One can be certain that this problem will present

itself again during the real CISL integration since the same equation is used with different bounds, further degrading the solution. This problem does not present itself as a perceivable degradation in accuracy in double precision (which supports roughly 15-16 significant base-ten digits of precision in the IEEE 754 standard) until the number of cells is increased to about 300 and greater, but it is still something to be wary of if implemented in a more

complex framework. For instance, if this were used in a regional climate model (RCM) based on spherical coordinates, such mesh sizes certainly could occur, and this would degrade the performance and even stability of PDLM. This is not to mention that double precision calculations require twice the memory and also more computational time to perform and are typically avoided when possible in operational implementation.
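The cancellation is easy to reproduce. The following Python sketch (illustrative only) evaluates the two logarithm terms entering (2.66) at the cell edges using the a_1 and b_1 values quoted above, assuming a unit domain so that h = 1/80:

    import numpy as np

    for dtype in (np.float32, np.float64):
        a  = dtype(2.258345e-4)           # a_1 from the 80-cell sine-wave test
        b  = dtype(-2.258855e-4)          # b_1: opposite sign, similar magnitude
        h  = dtype(1.0 / 80.0)
        x0 = h / 2                        # midpoint of the first cell
        xL, xR = dtype(0.0), h            # cell edges
        alpha = x0 + h / 2 * (2 / a - 1)  # (2.64)
        beta  = x0 + h / 2 * (2 / b - 1)  # (2.65)
        t1 = np.log(np.abs(xR - alpha))
        t2 = np.log(np.abs(xL - beta))
        print(dtype.__name__, t1, t2, t1 - t2)

In single precision the two logarithms agree to the last bit, so the large c/a- and d/b-weighted terms of (2.66) cancel catastrophically; in double precision a meaningful difference survives.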


Figure 2.13: PDLM representation of irregular signal with 20 cells. The black boxes are the cell means, and the red lines are the PDLM approximations.

To summarize the process of PDLM reconstruction:

1. For all cells, calculate the left and right derivatives using (2.34) and (2.35) as with PHM

(three-cell stencil).

2. For all cells, calculate the parameters a, b, c, and d using (2.58-2.61).

3. For all cells, construct a unique double logarithmic function, \phi, to describe the inner-cell distribution using (2.63) and (2.64)-(2.65).

4. For all cells, calculate the integrated mean of the double logarithmic function, φ¯, using (2.66).

5. The overall limited, conservative double logarithmic function is now described by (2.56).
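As a concrete illustration of steps 1-3, the following minimal Python sketch (names illustrative) computes the PDLM parameters for one cell and returns the double logarithmic function \phi of (2.63); the conservative mean subtraction of (2.56) and (2.66) would then follow:

    import math

    def pdlm_phi(dL, dR, x0, h, q=1.4, N=80):
        # Step 2: parameters a, b, c, d from (2.58)-(2.61); tol = h**q / 10, h = 1/N
        tol = (1.0 / N) ** q / 10.0
        a = (1.0 - tol) * (1.0 + tol
            - (2.0 * abs(dL) ** q * abs(dR) ** q + tol)
            / (abs(dL) ** (2 * q) + abs(dR) ** (2 * q) + tol))
        b = a / (a - 1.0)
        c = (a - 1.0) * (dR - (1.0 - b) * dL) / (b - a)
        d = dL - c
        # Step 3: the double logarithmic function of (2.63) via (2.64)-(2.65)
        alpha = x0 + h / 2.0 * (2.0 / a - 1.0)
        beta  = x0 + h / 2.0 * (2.0 / b - 1.0)
        def phi(x):
            return (-(c * h / a) * math.log(abs(x - alpha))
                    - (d * h / b) * math.log(abs(x - beta)))
        return phi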

[Figure 2.14 panels: (a) PDLM representation with 40 cells; (b) PDLM representation with 80 cells. Axes: Density vs. x-axis.]

Figure 2.14: PDLM representation of irregular signal with single precision calculations of integrated means. The black boxes represent the actual means, and the red lines represent the single precision PDLM approximations to those means. The large errors are caused by floating point arithmetic problems.

2.2.4 Piecewise Double Hyperbolic Method (PDHM)

This method was also developed in AS06 with a derivation similar to that of PDLM, again applied to an Eulerian finite-volume context, and it has little connection with the power-limited PHM developed in M94 and S06 save the fact that it indeed uses hyperbolas to approximate the inner-cell density distribution. As the name indicates, this method uses two hyperbolas

to approximate the subgrid variation to third-order accuracy using a three-cell stencil, and the equation for this fitting is

\tilde{\rho}_i(x) = \bar{\rho}_i + \phi_i(x) - \bar{\phi}_i = \bar{\rho}_i + \phi_i(x) - \frac{1}{h_i} \int_{x_{i-1/2}}^{x_{i+1/2}} \phi_i(x) \, dx \quad (2.68)

where \phi_i is the actual double hyperbolic function given by

\phi_i(x) = -\frac{c_i h_i^2}{a_i^2} \frac{1}{x - \alpha_i} - \frac{d_i h_i^2}{b_i^2} \frac{1}{x - \beta_i}, \quad (2.69)

the parameters c_i and d_i are

c_i = \frac{(a_i - 1)^2 d_R - (b_i - 1)^2 d_L}{(a_i - 2 + b_i)(b_i - a_i)} \quad (2.70)

d_i = d_L - c_i \quad (2.71)

and \alpha_i and \beta_i are defined by (2.64)-(2.65). The parameters a_i and b_i are actually the same as

given in PDLM except with a value of q = 1. The bounds x_{i-1/2} and x_{i+1/2} again represent the left and right cell interface locations for cell i, and d_L and d_R represent the left and right interface derivatives approximated at second-order accuracy, given in (2.34)-(2.35). There is no need to analyze the possibilities of singularities because they are exactly the same as with

PDLM. Also, the tolerance applied to the parameter ai is proven in AS06 to render this method

LTVB in the Eulerian Godunov context. The integrated mean of φi (x) for cell i is given by

\bar{\phi}_i = \frac{1}{h_i} \int_{x_{i-1/2}}^{x_{i+1/2}} \phi_i(x) \, dx = -\frac{c_i h_i}{a_i^2} \ln\!\left( \frac{x_{i+1/2} - \alpha_i}{x_{i-1/2} - \alpha_i} \right) - \frac{d_i h_i}{b_i^2} \ln\!\left( \frac{x_{i+1/2} - \beta_i}{x_{i-1/2} - \beta_i} \right), \quad (2.72)

and the integral from the nearest left Eulerian boundary up to a given Lagrangian boundary is

\int_{x_{\hat{i}-1/2}}^{x^*_{i-1/2}} \tilde{\rho}_{\hat{i}}(x) \, dx = \left( x^*_{i-1/2} - x_{\hat{i}-1/2} \right) \left( \bar{\rho}_{\hat{i}} - \bar{\phi}_{\hat{i}} \right) - \frac{c_{\hat{i}} h_{\hat{i}}^2}{a_{\hat{i}}^2} \ln\!\left( \frac{x^*_{i-1/2} - \alpha_{\hat{i}}}{x_{\hat{i}-1/2} - \alpha_{\hat{i}}} \right) - \frac{d_{\hat{i}} h_{\hat{i}}^2}{b_{\hat{i}}^2} \ln\!\left( \frac{x^*_{i-1/2} - \beta_{\hat{i}}}{x_{\hat{i}-1/2} - \beta_{\hat{i}}} \right) \quad (2.73)

where the integral bounds are interpreted the same as in (2.66) and (2.67). This is a more efficient calculation than with the PDLM because the integral of \ln(x) is x[\ln(x) - 1] whereas the integral of 1/x is simply \ln(x). When integrating between bounds, the PDHM expression is able to use half the logarithmic calculations by use of the identity \ln(b) - \ln(a) = \ln(b/a). This cannot be done in the case of PDLM integration. Also, for the reasoning stated in the PDLM section, absolute values are taken of the natural logarithm arguments.


Figure 2.15: PDHM representation of irregular signal with 20 cells.

The approximations of PDHM as stated in AS06 are actually very similar to PDLM, as can be seen in the plot approximating the irregular signal with 20 cells in Fig. 2.15. However, there are two advantages of using PDHM instead of PDLM. First, there are fewer calculations involved because the logarithm terms may be combined. Second, the cancellation errors that were present in PDLM are not present in this particular scheme. For the first cell of 80 with sine wave input, the values of the PDLM logarithms between the two terms were identical in floating point representation. However, in PDHM, when the logarithm expressions are combined by the well-known identity of the difference of two logarithms, with exactly the same values for a_i and b_i, the two terms now differ. This leads to a confident assertion that it is the identical logarithm terms in PDLM that lead to the degradation in accuracy. To summarize the process of PDHM reconstruction:

1. For all cells, calculate the left and right derivatives using (2.34) and (2.35) as with PHM (three-cell stencil).

2. For all cells, calculate the parameters a and b using (2.58)-(2.61) with a value of q = 1, and calculate c and d using (2.70)-(2.71).

3. For all cells, construct a unique double hyperbolic function, \phi, to describe the inner-cell distribution using (2.69) and (2.64)-(2.65).

4. For all cells, calculate the integrated mean of the double hyperbolic function, φ¯, using (2.72).

5. The overall limited, conservative double hyperbolic function is now described by (2.68).
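A compact Python sketch of the full PDHM cell reconstruction under the definitions above (names illustrative; assumes a uniform grid so that the PDLM tolerance with q = 1 reduces to tol = h/10):

    import math

    def pdhm_reconstruction(dL, dR, rho_bar, x0, h):
        # a, b as in PDLM (2.60)-(2.61) with q = 1; c, d from (2.70)-(2.71)
        tol = h / 10.0
        a = (1.0 - tol) * (1.0 + tol - (2.0 * abs(dL) * abs(dR) + tol)
                           / (dL * dL + dR * dR + tol))
        b = a / (a - 1.0)
        c = ((a - 1.0) ** 2 * dR - (b - 1.0) ** 2 * dL) / ((a - 2.0 + b) * (b - a))
        d = dL - c
        alpha = x0 + h / 2.0 * (2.0 / a - 1.0)   # (2.64)
        beta  = x0 + h / 2.0 * (2.0 / b - 1.0)   # (2.65)
        xL, xR = x0 - h / 2.0, x0 + h / 2.0
        # Integrated mean of phi over the cell, (2.72)
        phi_bar = (-(c * h / a ** 2) * math.log(abs((xR - alpha) / (xL - alpha)))
                   - (d * h / b ** 2) * math.log(abs((xR - beta) / (xL - beta))))
        def rho(x):
            # (2.68) with the double hyperbola of (2.69)
            phi = -(c * h ** 2 / a ** 2) / (x - alpha) - (d * h ** 2 / b ** 2) / (x - beta)
            return rho_bar + phi - phi_bar
        return rho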

2.2.5 Piecewise Rational Method (PRM)

The PRM was developed in Xiao et al. (2002) (hereafter XEA02) in a SL context based on approximation via rational functions which, to be more precise, are the ratio of two parabolas. The PRM uses the cell interface values as with PPM instead of approximating the derivatives at the cell interfaces, and it is introduced in a context unlike the other four methods mentioned thus far. In PRM as introduced, the scheme is run just like any Eulerian flux-form finite-volume method. However, instead of recalculating the boundary values as is done in PPM, the boundary values are advected in a semi-Lagrangian manner. For the purposes of testing the piecewise rational functions in the context of the CCS, the scheme needs to be adapted in some manner to the same framework as the other schemes for the sake of meaningful comparison. The main complication in this is that the boundary values must be calculated at each time step by some means. The most immediate option is to simply use the same conservative reconstruction technique used in PPM to gain the interface values for PRM. Because it is not known from XEA02 itself whether the PRM relies on conservative boundary values to ensure mass conservation, this is the route that is taken, rendering this implementation

of PRM with a four-cell stencil just as with PPM. The rational functions are defined by:

\tilde{\rho}_i(x) = \frac{a_i + 2 b_i \left( x - x_{i-1/2} \right) + \beta_i b_i \left( x - x_{i-1/2} \right)^2}{\left[ 1 + \beta_i \left( x - x_{i-1/2} \right) \right]^2} \quad (2.74)

where a_i, b_i, and \beta_i are given by

a_i = \rho_{i-1/2}, \quad (2.75)

b_i = \beta_i \bar{\rho}_i + \frac{\bar{\rho}_i - \rho_{i-1/2}}{\Delta x_i}, \quad (2.76)

\beta_i = \frac{1}{\Delta x_i} \left( \frac{\rho_{i-1/2} - \bar{\rho}_i + \varepsilon}{\bar{\rho}_i - \rho_{i+1/2} + \varepsilon} - 1 \right). \quad (2.77)

In these equations, x_{i-1/2}, \rho_{i-1/2}, and \rho_{i+1/2} are the left cell interface location, left cell interface value, and right cell interface value respectively, and \varepsilon can be set as small as machine epsilon solely for the purpose of avoiding a divide by zero. In the implementation for this

study, it was set to \varepsilon = 10^{-10}, as sensitivity analysis showed virtually no effect on accuracy. There exist no discontinuities in the actual function itself, as shown in XEA02, because the definition of \beta_i does not allow the expression 1 + \beta_i (x - x_{i-1/2}) to reach the value zero within a given cell. As it turns out, in comparison to PPM, PRM has two main advantages. First, it needs no limiting because the function itself is implicitly much less oscillatory. Fig.

2.16 shows a PRM approximation to the same irregular signal problem with 20 cells as done for each of the previously mentioned methods to give an idea of what the rational functions look like. The representation seems to be closer to the PHM representation, and this is likely because a hyperbola is technically still just a rational function in the simplest case. Second, the PRM turns out to have a very simple solution to the integration owing to the fact that this scheme is based on integrating from an Eulerian left boundary up to a given point, and the function itself is formed in the same way (referring to the repeated expression x - x_{i-1/2}).


Figure 2.16: PRM representation of irregular signal with 20 cells.

The integration from the closest left Eulerian boundary up to an arbitrary Lagrangian boundary, using the same notation as in (2.67), is given by

\int_{x_{\hat{i}-1/2}}^{x^*_{i-1/2}} \tilde{\rho}_{\hat{i}}(x) \, dx = \frac{\delta x \left( a_{\hat{i}} + b_{\hat{i}} \, \delta x \right)}{\beta_{\hat{i}} \, \delta x + 1} \quad (2.78)

where \delta x is defined as

\delta x = x^*_{i-1/2} - x_{\hat{i}-1/2}. \quad (2.79)

This expression also has no possibility of singularity, for the same reason as above owing to the definition of \beta_i. To summarize the process of PRM reconstruction:

1. Calculate all of the cell interface values using the exact same procedure as PPM (four-

cell stencil).

2. Calculate a, b, and β using (2.75-2.77).

3. The rational function is then defined by (2.74).
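Because every piece of PRM is explicit, the entire reconstruction fits in a few lines. The following Python sketch (names illustrative) builds the rational function and its partial integral for one cell in the local coordinate s = x - x_{i-1/2}:

    def prm_reconstruction(rho_L, rho_R, rho_bar, dx, eps=1e-10):
        # Parameters from (2.75)-(2.77); rho_L, rho_R are PPM-style interface values
        a = rho_L
        beta = ((rho_L - rho_bar + eps) / (rho_bar - rho_R + eps) - 1.0) / dx
        b = beta * rho_bar + (rho_bar - rho_L) / dx
        def rho(s):
            # (2.74)
            return (a + 2.0 * b * s + beta * b * s * s) / (1.0 + beta * s) ** 2
        def integral(s):
            # Partial integral (2.78), with s playing the role of delta x in (2.79)
            return s * (a + b * s) / (beta * s + 1.0)
        return rho, integral

    # Conservation check: integrating across the whole cell recovers rho_bar * dx
    rho, integral = prm_reconstruction(0.8, 1.3, 1.0, 0.1)
    print(integral(0.1))   # -> 0.1 = rho_bar * dx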

2.3 Constructing the Piecewise Parabolic Method - Hybrid

(PPM-H)

The PPM-H is conceived upon the fact that PPM has well-known deficiencies in its representa- tions of steep gradients which cause parabola overshoots violating monotonicity and local ex- trema which are monotonically constrained to be piecewise constant since it cannot be known how far above (below) the cell mean the true maximum (minimum) extends. Because PPM automatically tests for these conditions, creating an adaptive approximation scheme to “fill in these holes” is actually quite trivial; and this process will be described in this section. Ex- perimentally, it was found that PHM gives the best replacement for the PPM overshoots, and this is actually expected especially since PHM’s steepness can be tweaked with its exponent parameter in the power limiter. Therefore, PHM is indeed used to replace PPM in the event of this occurrence. The increase in accuracy due to replacing PPM at overshoot occurrences with PHM is much larger in magnitude than any replacement of extrema (see 2.3.2).

2.3.1 PHM Replacement at PPM Overshoots

This section describes the method of substituting PHM at occurrences of PPM overshoots. First, a simple replacement with PHM at all PPM overshoots is used. Performing a simulation in 1-D advecting a square wave for one cyclic revolution across 40 cells over 80 time steps, a major problem with monotonicity is uncovered as the solution exceeds the initial maximum

by 1.81 \times 10^{-2} for a PHM power exponent of p = 3.999. With p = 3, the overshoot is only 1.99 \times 10^{-5}, which is a far more manageable result, but maximum variation is desired whenever possible. Therefore, a value of p = 3 should be used for “severe” jumps and p = 3.999 should be used for “less severe” jumps to maintain the greatest amount of accuracy. This requires some measure of the “severity” of a jump, which will be denoted as S. The severity of a jump is most generally characterized by anything that would cause a large overshoot, since that is the quality in the solution which needs to be improved. In theory, a cell in which a steep jump occurs must be a cell in which the derivative magnitudes are very different. For instance, in Fig. 2.8, the main evidence that an overshoot is occurring is that the left interface derivative is much smaller in magnitude than the right interface derivative. In terms of the PHM construction, which already calculates the magnitudes of the derivatives, one can construct a trivial value to proxy the severity of a jump based on the lateral derivatives. However, suppose one transported variable has data ranging from 0 to 100 and another from 0 to 1: their derivatives will differ by roughly two orders of magnitude. A change in grid spacing would present the same problem. For this reason, there needs to be some normalizer to keep the severity measures on the same order of magnitude no matter what the data or grid spacing. Therefore, utilizing the trivial property that the difference between two positive numbers may never be greater than their sum, the difference in lateral derivative magnitudes is simply normalized by the sum of the lateral derivative magnitudes. The magnitude of this normalized quantity is constrained between the values of zero and unity and serves as a robust measure of jump severity:

S = \frac{|d_L| - |d_R|}{|d_L| + |d_R|} \quad (2.80)

Empirically, it is found that the overshoots are reasonably well controlled if the following PHM power limiter exponents are used under the following conditions:

p = \begin{cases} 3 & \text{if } S \ge 0.8 \\ 3.999 & \text{if } S < 0.8 \end{cases} \quad (2.81)

With this particular limiter in place adapting to the severity of a jump, the resultant overshoot for the aforementioned 40-cell square wave advection run is only 5.88 \times 10^{-5}, and the steepness is shown to be better constructed than if a power limiter exponent of p = 3 were used all the time. This is reflected in the global error norms wherein, for a universal p = 3,

L_1 = 7.96 \times 10^{-2}, and when using (2.81), L_1 = 7.39 \times 10^{-2}; the only causes of error in the simulation are overshoots and resolving the steepness of the jump. Since the overshoot is worse for the adaptive exponent case yet the error is lower, clearly the increase in accuracy is due solely to better steepness preservation. It will be demonstrated in section 2.3.2.3 that these overshoots can be further reduced by using PHM to resolve only the extrema due to the slight overshoots.
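In code, the severity test and exponent switch of (2.80)-(2.81) amount to a few lines; this Python sketch (names illustrative) uses the magnitude of S, per the discussion above:

    def jump_severity(dL, dR, eps=1e-10):
        # Magnitude of the normalized difference of lateral derivatives, (2.80)
        return abs(abs(dL) - abs(dR)) / (abs(dL) + abs(dR) + eps)

    def phm_exponent(S):
        # Adaptive power-limiter exponent, (2.81)
        return 3.0 if S >= 0.8 else 3.999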

2.3.2 Replacement Methods for Extrema

Next, the most natural solution to the treatment of local extrema is to use a method which handles extrema with an order of accuracy greater than first-order: either PDLM or PDHM.

Because of its superiority over PDLM in terms of floating point issues, and because both are very similar in their subgrid approximations, PDHM is tested in replacing PPM in the event of local extrema. It was found that because PHM may overshoot the neighboring cell values and lead to a bounded violation of monotonicity, it can create extrema where extrema did not previously exist. It is theorized that when coupled with the PDHM, this problem could be further complicated because it further violates monotonicity due to the sharp extrema that result. This would not happen with PPM's natural extrema handling because PPM replaces extrema with a piecewise constant, which is known to be extremely diffusive.

Because of this effect, a limiter is put in place to employ PDHM only in the case of “smooth enough” extrema. The sharpness of an extremum can be quantified as the sum of the absolute values of the left and right derivatives, which may be limited as given in (2.82), where C_E is the extrema sharpness cutoff. The reason this quantifies an extremum's sharpness is that extrema by definition have derivatives of opposite sign. Therefore, the greater the sum of the magnitudes, the sharper the extremum, and the more PDHM is liable to further violate monotonicity, even to the point of degrading accuracy.

\left| \frac{\rho_{i+1/2} - \bar{\rho}_i}{\Delta x_i / 2} \right| + \left| \frac{\bar{\rho}_i - \rho_{i-1/2}}{\Delta x_i / 2} \right| \le C_E \quad (2.82)
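A direct transcription of this sharpness test in Python (names illustrative; the cutoff C_E comes from the experiments described below):

    def extremum_sharpness(rho_L, rho_R, rho_bar, dx):
        # Left-hand side of (2.82): summed one-sided derivative magnitudes
        return (abs((rho_R - rho_bar) / (dx / 2.0))
                + abs((rho_bar - rho_L) / (dx / 2.0)))

    def smooth_enough(rho_L, rho_R, rho_bar, dx, C_E):
        return extremum_sharpness(rho_L, rho_R, rho_bar, dx) <= C_E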

Also, even though PHM resolves local extrema with a supposed 1.5-order accuracy (M94), which is formally higher-order than PPM's first-order (piecewise constant) representation, the literal error norm values are not always improved. PHM replacement is therefore also used with a sharpness

cutoff since it is well-demonstrated in this paper that PHM overshoots the most in the presence of sharp gradients (including extrema). Next, what is needed is an attempt to find an optimum combination of the PHM power limiter exponent and PDHM sharpness cutoff parameter for each test case to gain a good overall

understanding of this hybrid scheme's performance. For a review of the test cases mentioned here, see Section 2.4. Because we are still in the 1-D context, a brute force approach to finding this optimum combination is not infeasible, and in fact it is employed.

Therefore, 100 different PHM power exponents are tested spanning p \in [3, 4) using equally spaced intervals, and there are three different sets of PDHM/PHM extrema sharpness cutoff

values: 20 equally spaced over C_E \in [0, 0.1), 20 equally spaced over C_E \in [0.1, 1), and 20 equally spaced over C_E \in [1, 10) (making 60 test values in all). This

error norm as the dependent variable. The reasoning for the CE test grid is mainly because it is

unknown at what order of magnitude the most sensitivity will exist. Therefore, equal meshing is applied to each order of magnitude to be more certain of obtaining the interval of greatest sensitivity.

2.3.2.1 PDHM Replacement for Extrema

After viewing all of the 3-D surface plots, it has been determined that the L1 and L2 error norms have almost identical shape for every initial data test case. These norms are given priority because unlike the L∞ norm, L1 and L2 give an overall indication of accuracy across the grid. Also, since they are so similar, only L2 norms will be displayed in this study for the

purpose of illustrating comparative accuracies. For illustration purposes, the CE independent axis is displayed in a logarithmic scale because for differing initial data, it turns out that the

order of magnitude of the sharpness cutoff also differs. Fig. 2.17 shows the L2 error norms for the five different 1-D tests with 80 cells of resolution. The thing to notice in these figures is

that there is no consistent lowering of L2 error as the sharpness cutoff is increased from zero. A sharpness cutoff value of zero effectively denotes that the PDHM method is never used and serves as the baseline for comparison. In general, for the higher PHM exponents (say, greater than 3.75), the more PDHM is used, the more the accuracy is degraded. The only exception to this is in the case of the triangle

wave, and the increase in accuracy due to PDHM’s use is modest enough not to compare to the degradation in the other cases. This basically suggests that in general, handling extrema with PDHM does not improve accuracy. The irregular signal profile is the exception to this rule in the case of lower PHM exponents. This is likely because the irregular signal contains a large number of extrema, especially at 80-cell resolution. In the presence of less severe overshoots,

PDHM’s accuracy at local extrema benefits the PPM-H. However, as mentioned earlier, the increase in accuracy due to using PDHM for extrema alone is far less in magnitude than the increase in accuracy due to using PHM for overshoots alone. Therefore, this evidence strongly

[Figure 2.17 panels: (a) square wave input; (b) triangle wave input; (c) sine wave input; (d) steep gradient input; (e) irregular signal input. Axes: L2 error vs. PHM exponent and sharpness cutoff (log scale).]

Figure 2.17: Surface and contour plot of L2 error norms for PPM-H using PDHM to resolve local extrema.

suggests that PDHM is not a good choice for replacing PPM in the presence of extrema, and it will thus not be used. Especially since PDHM is certainly more computationally intensive than PPM's main loop, considering all aspects of a desirable scheme, this would be a poor choice.

2.3.2.2 PHM Replacement for Extrema

PHM’s 1.5-order accuracy in the presence of extrema may not be a bad choice for improving the solution. Fig. 2.18 shows surface and contour error plots over the same range of CE and p values to judge the effect of using PHM both for overshoots and in the presence of extrema. A sharpness cutoff value of zero effectively denotes that the PHM method is never used for extrema and serves as the baseline for comparison. Note that there is no consistent decrease in L2 error as sharpness cutoff and PHM exponent are varied. What is very interesting for each of these plots is that for each value of p and for each input type, there exists a range of

CE values which give optimal accuracy. One can also note that using PHM for extrema gives better accuracy in the optimal range than PDHM for all input types as well. One drawback is that for extremely smooth data such as the sine wave input, when using extremely high values of p, there exists a steep gradient in the CE direction in which the accuracy then degrades to worse than before. This is exactly opposite to the cases of non-smooth data such as the square wave and the steep gradient which both contain distinctive jump discontinuities. Regardless, there does not exist an optimum value (or even order of magnitude) for CE at which using PHM for extrema is ideal. Further research would be required to judge the meaning of these optimal ranges such that they could be calculated in the general case, but that is beyond the scope of this study. The evidence strongly suggests that there is no robust gain in accuracy in using a method other than PPM for local extrema. Therefore, local extrema are still resolved using the

PPM.

[Figure 2.18 panels: (a) square wave input; (b) triangle wave input; (c) sine wave input; (d) steep gradient input; (e) irregular signal input. Axes: L2 error vs. PHM exponent and sharpness cutoff (log scale).]

Figure 2.18: Surface and contour plot of L2 error norms for PPM-H using PHM to resolve local extrema.

2.3.2.3 Adaptive Use of PHM for New Extrema

Though adaptive replacement at extrema has so far been deemed not useful for the general case, one observation warrants a hypothesis and experimentation. It is observed that when substituting

PHM for steep jumps with a power limiter exponent of p = 3.999, the overshoot magnitude when advecting a square wave cyclically for one revolution over 40 grid cells and 80 time steps was 1.81 \times 10^{-2}. The same experiment, when run with plain PHM with the same power limiter exponent value, yielded an overshoot of only 6.96 \times 10^{-6}. This led to a hypothesis that PHM has a natural mechanism of limiting its own overshoots. This hypothesis has a logical basis when considering the construction of PHM, the general derivative characteristics of new extrema created by overshoots, and the fact that hyperbolas are monotonic functions. This mechanism is easiest to explain visually, and there are two situations which demonstrate any extrema that can form due to overshoots, the schematics of which are shown in Fig.

2.19. Consider cell i in Fig. 2.19a. PHM will approximate the left derivative because it has the smallest magnitude. Because this derivative is negative and hyperbolas, being monotonic functions, cannot change the sign of their derivative, the entire function must be negatively sloped. In order to preserve the integrated cell mass, the hyperbola must be shaped similar to the blue dashed line in cell i. Similar logic follows for cell i + 1. If one were to integrate part of cell i - 1 and part of cell i, the result would reduce the magnitude of the overshoot. Similarly, if one were to integrate part of cell i + 1 and part of cell i + 2, the result would reduce the magnitude of the undershoot. Fig. 2.19b shows the hyperbolic fittings for the reverse case as well, but still, the magnitudes of the overshoots and undershoots are always reduced. The fact that the centered derivative of the hyperbola is opposite in sign to the true centered derivative for extremum cells is why the accuracy degenerates at extrema. This gives strong evidence that using PHM for the overshoot-induced extrema would act to reduce extrema even further than the adaptive PHM exponent alone. For confirmation of these idealized hyperbolic reconstructions, see the sixth cell from the left domain boundary of Fig. 2.10b, which corresponds to cell i of Fig. 2.19a.

[Figure 2.19 panels: (a) jump from large to small; (b) jump from small to large. Axes: Density vs. x-axis, cells i-1 through i+2.]

Figure 2.19: Exaggerated schematics of new extrema created because of overshoots at discontinuous jumps. The black boxes represent the mean within a cell, the dashed red lines represent the derivatives at the interfaces between boxes across the jump, and the blue dashed lines show a schematic of the approximate hyperbolic fitting for cells i and i + 1. The wind is assumed to be blowing in the negative x direction.

Because of PHM's natural overshoot damping when used for extrema created by the overshoots, it is desirable to use PHM only for those extrema. This adaptive algorithm turns out to be very similar to the one used in section 2.3.1. Namely, the new extrema can be easily detected by the large difference in derivative magnitudes at the left and right cell interfaces.

Because the overshoot is created by a jump, it is necessary that one derivative be very large in magnitude. Because the overshoot is always small in comparison to the jump itself, the derivative on the opposite interface is necessarily much smaller in magnitude. Therefore, using the same “severity” measure of (2.80), the new extrema can be successfully separated from

the old. Empirically, it is found that the best damping of overshoots combined with the least degradation of accuracy occurs when PHM is used for all extrema such that:

S \ge 0.95 \quad (2.83)

Studying the same test case as mentioned before advecting a 1-D rectangle wave once across a domain of 40 grid cells and 80 time steps, using PHM for extrema when (2.83) is

satisfied yields an overshoot of magnitude zero with L_1 = 7.39 \times 10^{-2}. Therefore, with this adaptive use of PHM for extrema, the overshoots are completely removed in the final solution with the error slightly improved.
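Putting the pieces together, the cell-wise PPM-H decision reduces to the following Python sketch (illustrative only; the overshoot and extremum flags are those already produced by PPM's own limiter tests, and reusing the adaptive exponent of (2.81) on the extremum branch is an assumption here):

    def ppmh_choose(dL, dR, would_overshoot, is_extremum, eps=1e-10):
        # Severity of the jump at this cell: magnitude of (2.80)
        S = abs(abs(dL) - abs(dR)) / (abs(dL) + abs(dR) + eps)
        if would_overshoot:
            return "PHM", (3.0 if S >= 0.8 else 3.999)   # adaptive exponent, (2.81)
        if is_extremum and S >= 0.95:
            # New, overshoot-born extremum per (2.83): hand it to PHM as well
            return "PHM", (3.0 if S >= 0.8 else 3.999)
        return "PPM", None                               # default reconstruction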

2.3.3 Computational Intercomparison

This study's focus is not simply on accuracy but also on efficiency, which takes into account accuracy, speed, and parallelism. The advantage of the smaller stencil for PHM, PDHM, and PDLM has already been mentioned in that less communication is necessary in a parallel implementation, and the accuracy is to be reported with the results. For now, however, it would be beneficial to obtain an operation count for the various methods to gain an idea of how complex each of them is. Table 2.2 shows the operation counts for the construction of each approximating function except for PPM-H, since it is a compound construction. Table 2.3 shows the operation counts for the integration of each approximating function, again excluding

PPM-H. To gain an idea of the performance of PPM-H when it is replaced due to an overshoot, the construction cost of PHM when it is used must be added to the construction of the boundary values for PPM (listed in Table 2.2), since this must be performed to test for overshoots. The operation counts are optimized such that any integer exponent less than two was converted to a multiplication procedure. Temporary variables are not generally created for the sake of saving addition and subtraction operations because it seems that the consistent stack allocation and assignment of extra space would likely not be much more efficient than the addition and subtraction unless an operation is repeated often.

Table 2.2: Operation counts for the construction of the approximation functions. Bounds refers to the operation counts in the reconstruction of boundaries.

Method   +,-    *    /   exp   min,max   if   log   abs   sqrt   sign
PPM       11   15    0    0       0       3    0     0     0      0
PHM       11    8    5    1       2       2    0     4     1      1
PDLM      28   18    9    2       0       0    4     6     0      0
PDHM      24   22   11    0       0       0    2     4     0      0
PRM        6    6    3    0       0       0    0     2     0      0
Bounds    21   26   13    0       2       2    0     6     0      2

Table 2.3: Operation counts for one integration procedure. a/b in the PHM row gives the operation counts: (if alpha < tol) / (if alpha > tol).

Method   +,-    *    /   exp   min,max   if   log   abs   sign
PPM        6   11    1    0       0       0    0     0     0
PHM        9    7    3    0       0       1    2     2     0
PDLM      17    8    4    0       0       0    4     4     0
PDHM      12   12    6    0       0       0    2     2     0
PRM        3    3    1    0       0       0    0     0     0

This increases the number of operations for the boundary value reconstruction, but it should be noted that, especially with modern vector optimizations, addition and subtraction do not take up too significant an amount of time. What is most astounding in the tables is indeed how efficient PRM is, with only a small fraction of the operation counts of the other methods. With general boundary value reconstruction (as in irregular grid spacing), this saving is not as notable because it is somewhat drowned out by the much more expensive boundary reconstruction. However, in the zonal cascade sweep, the grid spacing is actually uniform for spherical application. When grid spacing is the same, the boundary value reconstruction operation counts consist of only two multiplications and three additions, in which case PRM would constitute a more significant saving over PPM. PHM is roughly comparable to PPM because of the need to evaluate the logarithms. When the grid consists of mostly constant data, PHM has shown to be much faster because of the linear approximation. However, this rarely happens in a true application. As can be seen,

PDLM and PDHM are more expensive than both PPM and PHM, and their accuracies would have to be significantly superior for them to be considered the more efficient schemes. Given that AS06 mentioned the great similarity between PDLM and PDHM, which is also shown in Fig. 2.13 and Fig. 2.15, it is very unlikely that PDLM will constitute a more desirable method than PDHM (which is much more efficient in the CCS SL framework). In summary, in terms of computational efficiency, PHM is most comparable to PPM, and PRM shows potential for quite a significant improvement in speed, especially with no need for limiting.

2.4 Advection Test Cases

For each of the frameworks (1-D, 2-D Cartesian, and 2-D spherical), the advection test cases consist of the advecting wind and the profile being advected, where a specific test case may be any combination of the two categories. The advecting wind is either uniform advection, solid body rotation, or deformational flow, but the actual data being advected may vary widely, as many options were implemented in this study. This section describes the mathematical formulation of each of the advecting winds and the initial profiles of density for all three frameworks (as the interpretation of the advecting winds indeed differs between the 2-D Cartesian and 2-D spherical contexts). It should be noted that though many of the advecting wind cases are not uniform in space, they are all constant in time and thus require only one calculation before the start of the simulation. Also, it is not the purpose of this study to test the performance of a particular trajectory scheme, and it is desirable to be entirely rid of any errors due to this. Therefore, all departure locations for the different advecting wind cases are calculated analytically. Thus, all error introduced into the solution is a product of the CCS framework and the particular spatial discretization used with it. Since all new spatial approximations use the same CCS framework, the comparison is rendered a fair and meaningful one.

2.4.1 1-D Test Cases

For all of the 1-D test cases, the transport is a trivial uniform flow, in which case one can simply calculate the backward trajectory as x_i^* = x_i − ū∆t, where x_i^* is the departure location of the Eulerian location x_i. When implemented, the boundaries (x_{i−1/2} and x_{i+1/2}) are the true values that are traced upstream. Again, in a realistic framework with nonuniform wind, the backward trajectories must be traced with the flow, requiring additional computation.
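A minimal Python sketch of this 1-D setup (with illustrative names, using the Courant-0.5 configuration employed later in the tests) is:

    import numpy as np

    n, u_bar = 40, 1.0                      # cells and uniform wind (m/s)
    dx = 1.0 / n
    dt = 0.5 * dx / u_bar                   # Courant number 0.5, i.e. dt = 1/(2n)
    x_bound = np.linspace(0.0, 1.0, n + 1)  # Eulerian cell boundaries x_{i+1/2}
    x_dep = (x_bound - u_bar * dt) % 1.0    # departure boundaries, cyclic domain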

2.4.1.1 Initial Data

Five test cases in all were used for the 1-D framework. First, a square (SQ) wave (Xiao et al., 2002; Carpenter et al., 1990; Lin and Rood, 1996; Laprise and Plante, 1995; Chlond, 1994; Marquina, 1994; and Serna, 2006) is used as initial data, with the functional initialization given trivially by (2.84), where D is the length of the domain. This initialization centers on the domain a rectangle of height unity and length half the size of the domain.

The square wave is a good test of diffusiveness because, over time, a scheme will inevitably smear it toward a Gaussian bump. This test gives a solid measure of how resistant a scheme is to this diffusion. In reality, this could easily correspond to a density front, which usually exists with quite a sharp boundary.

\[
\rho_0(x_i) =
\begin{cases}
1 & \text{if } \left| x_i - \frac{D}{2} \right| \le \frac{D}{4} \\[4pt]
0 & \text{if } \left| x_i - \frac{D}{2} \right| > \frac{D}{4}
\end{cases}
\tag{2.84}
\]

The second type of initial condition is a triangle (TR) wave (Chlond, 1994; Carpenter et al., 1990; and Xiao et al., 2002) given by (2.85). This centers across the domain a triangle with a base length one-fourth the size of the domain and a height of unity. The triangle wave is a less severe contact discontinuity, both at the top point and at the ends of the base, which for a

non-positive-definite scheme will lead to undershoots and likely to the Gibbs phenomenon.

\[
\rho_0(x_i) =
\begin{cases}
1 - \frac{8}{D} \left| x_i - \frac{D}{2} \right| & \text{if } \left| x_i - \frac{D}{2} \right| \le \frac{D}{8} \\[4pt]
0 & \text{if } \left| x_i - \frac{D}{2} \right| > \frac{D}{8}
\end{cases}
\tag{2.85}
\]

Thirdly, a sine (SN) wave is introduced as an initial condition according to (2.86), which gives one wavelength across the domain normalized between zero and unity. This is an essential test because it is widely known that PPM, as a result of clipping extrema with a piecewise constant, erroneously flattens the top of any smooth function. This effect can be quite extreme, in fact; thus it is desirable to test for this effect in the non-polynomial methods.

\[
\rho_0(x_i) = \frac{ \sin\!\left( \frac{2\pi}{D} x_i \right) + 1 }{2}
\tag{2.86}
\]

The fourth and fifth initial conditions are taken from Zerroukat et al. (2002), and both come from the same equation based on an input array, (2.87), with different inputs. In this equation,

ξ_i = x_i/D normalizes the coordinates from zero to unity, as required by the function. This is done for the sake of robustness so that the domain size may be tailored however the user wishes without need to modify the code. First, there is the steep gradient (SG) profile, which roughly consists of two square waves except that the discontinuities are less severe. Second, there is the irregular signal (IS) profile, which contains many local extrema and presents a challenging problem for any advection scheme because of the implicit diffusiveness of a discrete representation. The input arrays for both of these initial profiles are given in Table 2.4. A plot of each of these functions is given in Fig. 2.20 on a grid of 640 cells.

\[
\rho_0(\xi_i) = \left\{ \tanh\left[ C_1 (\xi_i - C_2) \right] + \tanh\left[ C_3 (\xi_i - C_4) \right] \right\}
\left[ 1 + C_5 \sin(2\pi C_6 \xi_i) \right]
\left[ 1 + C_7 \sin(2\pi C_8 \xi_i - C_9) \right]
+ \tanh\left[ C_{10} (\xi_i - C_{11}) \right] + \tanh\left[ -C_{10} (\xi_i - C_{12}) \right] + C_{13}
\tag{2.87}
\]

Table 2.4: Array input values for (2.87) for the steep gradient (SG) and irregular signal (IS) initialization profiles.

Input   C1    C2    C3    C4   C5   C6   C7   C8   C9   C10   C11   C12   C13
SG      200   0.1  -200   0.7   0    0    0    0    0   100   0.3   0.5    0
IS       10   0.3   -20   0.6  0.3  11   0.4  10   0.5  200   0.1   0.3    1
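For concreteness, a short Python sketch of (2.87) with the Table 2.4 inputs follows; it assumes the reading of the second sine factor as 1 + C7 sin(2πC8ξ − C9) shown above, and the names are illustrative.

    import numpy as np

    # Coefficient arrays from Table 2.4.
    SG = [200, 0.1, -200, 0.7, 0, 0, 0, 0, 0, 100, 0.3, 0.5, 0]
    IS = [10, 0.3, -20, 0.6, 0.3, 11, 0.4, 10, 0.5, 200, 0.1, 0.3, 1]

    def profile(xi, C):
        """Evaluate (2.87) at normalized coordinates xi = x/D in [0, 1]."""
        C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, C11, C12, C13 = C
        return ((np.tanh(C1 * (xi - C2)) + np.tanh(C3 * (xi - C4)))
                * (1 + C5 * np.sin(2 * np.pi * C6 * xi))
                * (1 + C7 * np.sin(2 * np.pi * C8 * xi - C9))
                + np.tanh(C10 * (xi - C11))
                + np.tanh(-C10 * (xi - C12)) + C13)

    xi = (np.arange(640) + 0.5) / 640.0   # cell centers on a 640-cell grid
    rho_sg, rho_is = profile(xi, SG), profile(xi, IS)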

2.4.2 2-D Cartesian Test Cases

2.4.2.1 Transport

In the 2-D Cartesian framework, two different types of transport are tested: uniform translation and solid body rotation. The first case is exactly the same as the previous case of uniform advection except extended into two dimensions, such that the upstream departure points are determined by (x_i^*, y_j^*) = (x_i − ū∆t, y_j − v̄∆t), where (x_i^*, y_j^*) represents the coordinates of the departure location of the Eulerian point (x_i, y_j). Just as mentioned before, the implementation

actually traces the boundary points, (x_{i±1/2}, y_{j±1/2}), to their upstream departure locations. This is true for the solid body rotation case as well.

Solid body rotation describes perfectly rotational flow with a uniform angular velocity across the domain, usually rotating about the domain center (which is the case in this study). The formulation of solid body rotation is quite simple when the concept of uniform angular velocity is utilized properly: no matter the location on the grid, the angle about the center is displaced by the same amount. Therefore, the first step in calculating the departure points analytically is to obtain the angle of the current point relative to the center of the domain, given in (2.88), where (x_M, y_M) represents the domain center. To find the departure location, the angle must simply be rotated upstream (again, with constant angular velocity ω₀) as in (2.89). Finally, the coordinates may be transformed back to find the departure location as in (2.90), using the radius calculated in (2.91).

Figure 2.20: Plots of the initial conditions. The square wave is in black, the triangle wave is in green, the sine wave is in dark blue, the steep gradient is in violet, and the irregular signal is in light blue.

\[
\theta_{i,j} = \tan^{-1}\!\left( \frac{y_j - y_M}{x_i - x_M} \right)
\tag{2.88}
\]

\[
\theta^*_{i,j} = \theta_{i,j} - \omega_0 \Delta t
\tag{2.89}
\]

\[
\left( x_i^*, y_j^* \right) = \left( r_{i,j} \cos\theta^*_{i,j} + x_M,\; r_{i,j} \sin\theta^*_{i,j} + y_M \right)
\tag{2.90}
\]

\[
r_{i,j} = \sqrt{ (x_i - x_M)^2 + (y_j - y_M)^2 }
\tag{2.91}
\]

Because analytical solutions are needed for error calculations, the object is always advected back to its original location, and that is compared against the initial data. Therefore, with uniform advection, it is cyclically brought back to the same location, and with solid body rotation, a certain number of full rotations are performed.
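A minimal Python sketch of the analytic departure-point calculation (2.88)-(2.91), using np.arctan2 as the quadrant-safe form of the inverse tangent (names illustrative), is:

    import numpy as np

    def departure_points(x, y, xm, ym, omega0, dt):
        """Cartesian solid-body-rotation departure points per (2.88)-(2.91)."""
        theta = np.arctan2(y - ym, x - xm)   # (2.88), quadrant-safe
        r = np.hypot(x - xm, y - ym)         # (2.91)
        theta_dep = theta - omega0 * dt      # (2.89): rotate upstream
        return (r * np.cos(theta_dep) + xm,  # (2.90)
                r * np.sin(theta_dep) + ym)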

2.4.2.2 Initial Data

The first test case introduced to the Cartesian framework is widely considered to be a formidably difficult one: the advection of a slotted cylinder (Zerroukat et al., 2002; Zerroukat et al., 2007; Carpenter et al., 1990; and Nair et al., 1999). The equations used in this study for the slotted cylinder problem are constructed as in Zerroukat et al. (2002), with the initial conditions given by (2.92), where D is the domain length.

\[
\rho_0\left( \xi_i, \zeta_j \right) =
\begin{cases}
1 & \text{if } r_{i,j} \le \sigma \text{ and } \left( |\xi_i| \ge s_w/2 \text{ or } \zeta_j \ge s_l - \sigma \right) \\[4pt]
0 & \text{otherwise}
\end{cases}
\tag{2.92}
\]

\[
\sigma = 0.25D, \quad s_w = 0.06D, \quad s_l = 0.25D, \quad \gamma = 0.25D
\]

\[
\xi_i = x_i - x_M + \gamma, \quad \zeta_j = y_j - y_M, \quad r_{i,j} = \sqrt{ \xi_i^2 + \zeta_j^2 }
\]

Secondly, a cosine hill, also taken from Zerroukat et al. (2002), is used to test a scheme's ability to retain the shape of a smoother function, though it is still only quasi-smooth. This data is

formed with a cosine function whose argument ranges from zero to π as the radius extends from the hill center outward to the maximum radius. The hill is normalized between zero and 100 and given by (2.93), with the same definition for the radius as above and σ = D/8.

\[
\rho_0\left( r_{i,j} \right) =
\begin{cases}
\frac{100}{2} \left[ 1 + \cos\left( \pi r_{i,j} / \sigma \right) \right] & \text{if } r_{i,j} \le \sigma \\[4pt]
0 & \text{otherwise}
\end{cases}
\tag{2.93}
\]

The next function used for 2-D Cartesian initialization is one wavelength of a 2-D sine wave, which was not found in any prominent literature on the subject. This is really to test the uniform advection of very smooth data. This function is given in (2.94).

\[
\rho_0\left( x_i, y_j \right) = \frac{ \sin\!\left( \frac{2\pi}{D} x_i \right) + \sin\!\left( \frac{2\pi}{D} y_j \right) + 2 }{4}
\tag{2.94}
\]

Finally, the Leveque data (Leveque, 2002), consisting of a rectangular block and a cone, is used as given by (2.95). To use this particular formulation, the grid must span the domain x ∈ [0,2], y ∈ [0,2]. To accommodate a different grid, it would be simplest to normalize the coordinates within this range.

\[
\rho_0\left( x_i, y_j \right) =
\begin{cases}
1 & \text{if } 1.1 < x_i < 1.6 \text{ and } 0.75 < y_j < 1.25 \\[4pt]
1 - r_{i,j}/0.35 & \text{if } r_{i,j} < 0.35 \\[4pt]
0 & \text{otherwise}
\end{cases}
\tag{2.95}
\]

\[
r_{i,j} = \sqrt{ (x_i - 0.55)^2 + (y_j - 1)^2 }
\]

Fig. 2.21 contains a plot of all of these initial data profiles. Note that the Leveque data is constrained to a specific domain while the rest are not, and it is therefore displayed with different coordinates. The domain coordinates chosen here are for the purpose of demonstration, as they may change throughout the test cases for comparison with results in other literature.
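As one concrete example of these initializations, a Python sketch of the Leveque data (2.95) is given below; it samples cell centers rather than integrating exact cell means, so it is illustrative only.

    import numpy as np

    def leveque_init(x, y):
        """Block-plus-cone data of (2.95) for cell centers x, y in [0, 2]."""
        x, y = np.broadcast_arrays(x, y)
        rho = np.zeros(x.shape)
        r = np.hypot(x - 0.55, y - 1.0)
        cone = r < 0.35
        rho[cone] = 1.0 - r[cone] / 0.35                             # the cone
        rho[(1.1 < x) & (x < 1.6) & (0.75 < y) & (y < 1.25)] = 1.0   # the block
        return rho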

2.4.3 2-D Spherical Test Cases

2.4.3.1 Transport

In the 2-D spherical framework, two different types of flow are tested: solid body rotation and deformational flow. Solid body rotation in the context of spherical advection consists of a 3-D rotation about the Earth's center along the surface of the sphere. As previously mentioned, the departure points need to be calculated analytically. In order to do this, the simplest case is first described, in which the flow is entirely equatorial (rotating about the Earth's axis). In spherical coordinates, this means that the angular velocity in the zonal direction is constant. Thus, to calculate the departure location, the latitude, θ, remains the same, and the longitude, λ, changes by a uniform constant. This means that the departure location is simply λ_i^* = λ_i − ω_λ∆t, in which (λ_i^*, θ_j) is the departure location of (λ_i, θ_j) and ω_λ is the uniform angular velocity in the zonal direction.

Figure 2.21: Initial profiles for the 2-D Cartesian framework. (a) Slotted cylinder; (b) cosine cone; (c) 2-D sine wave; (d) Leveque.

Clearly, this case is far too restrictive, as advection over the poles is necessary to reveal a scheme's difficulties with the polar singularities, and this advection must be generalized to allow for flow around the Earth at any desired angle instead of simply equatorial flow.

In order to accomplish this generalization, equations (2.20 - 2.23) are employed: the equatorial departure calculation is performed on a rotated sphere, and then the rotated departure points are transformed back into unrotated coordinates. With the location of the rotated sphere's north pole (λ_p, θ_p) defined, the coordinates of a given point are transformed into rotated spherical coordinates, (λ′, θ′). Then, the constant angular velocity is added to the rotated longitude such that the departure location in rotated coordinates is (λ′ − ω_λ∆t, θ′). Finally, the rotated departure coordinates are transformed back into unrotated departure coordinates (λ*, θ*). The location of the north pole of the rotated sphere may be defined with a single parameter α_r such that (λ_p, θ_p) = (0, π/2 − α_r). Thus, α_r = π/2 yields a poleward (meridionally oriented) solid body rotation, and α_r = 0 yields equatorial flow in the positive zonal direction.

It turns out that the case of deformational flow is almost identical to the solid body rotation case except that the angular velocity is not uniform. In fact, the rotation increases toward the poles and is zero at the equator, such that two opposing vortices exist: one centered about the North Pole and one centered about the South Pole, and the flow is always zonal. This also may be generalized in exactly the same manner as the solid body rotation by applying it to a rotated spherical coordinate system. The rotated zonal angular velocity as a function of rotated latitude is defined in (2.96), where r_j = cos θ′.

\[
\omega_r(r_j) =
\begin{cases}
0 & \text{if } r_j = 0 \\[4pt]
\dfrac{3\sqrt{3}}{2} \dfrac{\tanh(r_j)}{r_j \cosh^2(r_j)} & \text{if } r_j \ne 0
\end{cases}
\tag{2.96}
\]

2.4.3.2 Initial Data

There are only two types of initial data imposed on the spherical geometry framework: a cosine hill for solid body advection experiments and a deformational flow setup for the deformational

flow experiments. The cosine hill is very similar to its Cartesian counterpart except that the radius is based on the great-circle distance from the cosine cone center. It is given by (2.97), where r_max = 7π/64, λ_C = 3π/2, θ_C = 0, and r_{i,j} is defined by (2.98).

\[
\rho_0\left( r_{i,j} \right) =
\begin{cases}
0 & \text{if } r_{i,j} > r_{max} \\[4pt]
\frac{1}{2} \left[ 1 + \cos\left( \pi \frac{r_{i,j}}{r_{max}} \right) \right] & \text{if } r_{i,j} \le r_{max}
\end{cases}
\tag{2.97}
\]

\[
r_{i,j} = \cos^{-1}\left[ \sin\theta_C \sin\theta_j + \cos\theta_C \cos\theta_j \cos\left( \lambda_i - \lambda_C \right) \right]
\tag{2.98}
\]

The deformational flow setup (for instance, preparing for a polar vortex simulation) essentially places a certain gradient of density across the poles, given by (2.99) in terms of rotated spherical coordinates (typically rotated the same as the deformational flow itself), where the parameter γ controls the sharpness of the gradient. A value of γ = 0.01 corresponds to a case of non-smooth deformational flow, and a value of γ = 5 to a case of very smooth deformational flow.

\[
\rho_0\left( \lambda'_i, \theta'_j \right) = 1 - \tanh\left[ \frac{3\cos\theta'_j}{\gamma} \sin\lambda'_i \right]
\tag{2.99}
\]

In order to judge the error of the cosine cone for spherical solid body rotation, the cone is simply rotated back to its initial position, and the initial data is used for error calculations. For the deformational flow, however, the analytical field in terms of rotated spherical coordinates at any given point in time is given by (2.100), where ω_r is as defined by (2.96). Plots of these two initializations are given in Fig. 2.22. As can be seen in Fig. 2.22a-b, a zonal velocity

increasing toward the poles will yield a vortex as time increases. The seeming asymmetry of the deformational flow plots is a product of rotating the coordinate system slightly off the pole (λ_p = π + 0.025, as noted in the caption of Fig. 2.22).

\[
\rho\left( \lambda'_i, \theta'_j, t \right) = 1 - \tanh\left[ \frac{3\cos\theta'_j}{\gamma} \sin\left( \lambda'_i - \omega_r t \right) \right]
\tag{2.100}
\]
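A minimal Python sketch of the deformational-flow pieces, combining the angular velocity (2.96) with the analytic field (2.99)-(2.100) under the same definitions as above (r_j = cos θ′, sharpness parameter γ), follows; the names are illustrative.

    import numpy as np

    def omega_r(theta_p):
        """Rotated zonal angular velocity of (2.96), with r_j = cos(theta')."""
        r = np.cos(np.asarray(theta_p, dtype=float))
        w = np.zeros_like(r)
        nz = r != 0.0
        w[nz] = 1.5 * np.sqrt(3.0) * np.tanh(r[nz]) / (r[nz] * np.cosh(r[nz]) ** 2)
        return w

    def vortex_field(lam_p, theta_p, t, gamma):
        """Analytic field (2.100); at t = 0 it reduces to the setup (2.99)."""
        return 1.0 - np.tanh((3.0 * np.cos(theta_p) / gamma)
                             * np.sin(lam_p - omega_r(theta_p) * t))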

2.5 The Error Norms

There are six error norms employed in this study, and though they are all commonly used, it seems advantageous to gain an understanding of what particular characteristics of the error each of them extracts. All error norms used here are applied to the global domain, and they are:

L1, L2, L∞, Lmin, Lmax, and the Root Mean Squared Error (RMSE). The error norms are defined as follows:

\[
L_1 = \frac{ I\left( \left| \bar{\rho}_{num} - \bar{\rho}_{an} \right| \right) }{ I\left( \left| \bar{\rho}_{an} \right| \right) }
\tag{2.101}
\]

\[
L_2 = \sqrt{ \frac{ I\left( \left| \bar{\rho}_{num} - \bar{\rho}_{an} \right|^2 \right) }{ I\left( \left| \bar{\rho}_{an} \right|^2 \right) } }
\tag{2.102}
\]

\[
L_\infty = \frac{ \max\left( \left| \bar{\rho}_{num} - \bar{\rho}_{an} \right| \right) }{ \max\left( \left| \bar{\rho}_{an} \right| \right) }
\tag{2.103}
\]

\[
L_{min} = \frac{ \min\left( \bar{\rho}_{num} \right) - \min\left( \bar{\rho}_{an} \right) }{ \max\left( \bar{\rho}_{an} \right) - \min\left( \bar{\rho}_{an} \right) }
\tag{2.104}
\]

\[
L_{max} = \frac{ \max\left( \bar{\rho}_{num} \right) - \max\left( \bar{\rho}_{an} \right) }{ \max\left( \bar{\rho}_{an} \right) - \min\left( \bar{\rho}_{an} \right) }
\tag{2.105}
\]

Figure 2.22: Contour plots of initializations for spherical geometry. (a) Smooth deformational flow setup, γ = 5 (North Pole view), rotated such that λ_p = π + 0.025. (b) Non-smooth deformational flow setup, γ = 0.01 (North Pole view), rotated such that λ_p = π + 0.025. (c) Cosine cone.

\[
\mathrm{RMSE} = \sqrt{ \frac{1}{n_x n_y} \sum_i \left[ \sum_j \left( \bar{\rho}_{num_{i,j}} - \bar{\rho}_{an_{i,j}} \right)^2 \right] }
\tag{2.106}
\]

where the function I(x) is the global numerical mass integral defined as:

\[
I(x) = \sum_i \left[ \sum_j x_{i,j} A_{i,j} \right]
\tag{2.107}
\]

where A_{i,j} is the area of the cell with coordinates (x_i, y_j) in Cartesian coordinates or (λ_i, μ_j) in spherical coordinates. Of course, in 1-D, n_y = 1 and j = {1}.

The L-norms defined by (2.101 - 2.105) have all of the relevant information in the numerators, and the denominators simply serve as normalizers. Probably the most straightforward

measure of global error in terms of mass is the L1 norm, which simply integrates the global absolute mass error and divides by the global analytical mass to normalize the quantity. This means that if L1 > 1, the absolute mass error is more than the amount of mass itself, which is clearly a bad result for any experiment. L2 is similar to L1 except that in L2 space, the larger errors are given more weight in the overall calculation. The L2 value is still normalized, but with the square term, larger errors are weighted more heavily than smaller errors. L∞ is not a global integral but rather gives the largest absolute error on the domain normalized by the largest original value on the domain. In practice, L2 behaves very similarly to L∞ when L∞ is large because L2 weights the larger values more, though L2 still takes the smaller errors somewhat into account. Lmin is a very useful measure giving the difference in minimum values between the analytical solution and the numerical solution, normalized by the range of values within the analytical solution. This is useful when determining the largest magnitude of undershoots in a scheme. Similarly, Lmax > 0 gives a measure of the worst overshoot caused in a scheme due to monotonicity violation. If the minimum value of the analytical data is zero, Lmin < 0 denotes a violation of positive definiteness, which can lead to serious problems for positive definite quantities. RMSE is a very common error measure for operational numerical weather prediction models, and it is most similar to L2 because it is based on the squared error and weights larger errors more. However, RMSE is an average measure and not a normalized measure.

Therefore, if data of average value O(1) were multiplied uniformly by 10, the RMSE of the resultant data would be one order of magnitude larger than that of the original data. To bring this to the reader's attention: there does exist dispute about the viability of squared measures of error, specifically the RMSE and its ability to unambiguously convey model performance. According to Willmott and Matsuura (2005) and Willmott and Matsuura (2006), the RMSE is not just a function of the average error magnitude but also of the variability within the error distribution and the number of error measures. This makes sense based on what was said earlier, because large variability in the error distribution would cause the larger errors to dwarf the small errors via implicitly unequal weighting, and the RMSE changes non-linearly with an increase in the average absolute error. This could be extended to L2 because, though it is not a function of the number of error measures, it is indeed a function of the variability within the error distribution and not only of the mean error and data magnitudes. Therefore, the only unambiguous and direct measure of absolute error in this study is the L1 norm, which is a normalized measure of absolute error relative to the original data magnitudes.
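For concreteness, the six norms (2.101)-(2.106) and the mass integral (2.107) can be sketched in Python as follows (cell means and areas as arrays; names illustrative):

    import numpy as np

    def error_norms(num, an, area):
        """Global error norms (2.101)-(2.106) for cell-mean fields num, an."""
        I = lambda f: np.sum(f * area)                           # (2.107)
        rng = an.max() - an.min()
        return {
            "L1":   I(np.abs(num - an)) / I(np.abs(an)),             # (2.101)
            "L2":   np.sqrt(I((num - an) ** 2) / I(an ** 2)),        # (2.102)
            "Linf": np.max(np.abs(num - an)) / np.max(np.abs(an)),   # (2.103)
            "Lmin": (num.min() - an.min()) / rng,                    # (2.104)
            "Lmax": (num.max() - an.max()) / rng,                    # (2.105)
            "RMSE": np.sqrt(np.mean((num - an) ** 2)),               # (2.106)
        }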

2.6 Comparison with a Modern Scheme

It would be useful to gain some idea of the performance and behavior of these schemes relative to a more modern, state-of-the-art scheme used in practice. For this purpose, this study implements the WRF (Weather Research and Forecast) model transport schemes with uniform advection in an Eulerian framework. In WRF, advection is performed by interpolating a value for the transported variable at a cell interface and multiplying it by the wind to obtain the flux through the interface. This is the flux for both neighboring cells and thus renders the transport schemes conservative. In the present uniform-wind case, it is trivial to obtain the flux since the interface value may simply be multiplied by a constant wind. If we consider a generic left-cell-interface value calculation, ρ^{nth}_{i−1/2}, for density with accuracy of order n for cell i, the second-, fourth-, and sixth-order formulas are as follows:

\[
\rho^{2nd}_{i-1/2} = \frac{1}{2} \left( \rho_i + \rho_{i-1} \right)
\tag{2.108}
\]

\[
\rho^{4th}_{i-1/2} = \frac{7}{12} \left( \rho_i + \rho_{i-1} \right) - \frac{1}{12} \left( \rho_{i+1} + \rho_{i-2} \right)
\tag{2.109}
\]

\[
\rho^{6th}_{i-1/2} = \frac{37}{60} \left( \rho_i + \rho_{i-1} \right) - \frac{2}{15} \left( \rho_{i+1} + \rho_{i-2} \right) + \frac{1}{60} \left( \rho_{i+2} + \rho_{i-3} \right)
\tag{2.110}
\]

The first-, third-, and fifth-order formulas represent upwind perturbations of the previous ones, such that:

\[
\rho^{1st}_{i-1/2} = \rho^{2nd}_{i-1/2} - \frac{\mathrm{sign}(\bar{u})}{2} \left( \rho_i - \rho_{i-1} \right)
\tag{2.111}
\]

\[
\rho^{3rd}_{i-1/2} = \rho^{4th}_{i-1/2} + \frac{\mathrm{sign}(\bar{u})}{12} \left[ \left( \rho_{i+1} - \rho_{i-2} \right) - 3 \left( \rho_i - \rho_{i-1} \right) \right]
\tag{2.112}
\]

\[
\rho^{5th}_{i-1/2} = \rho^{6th}_{i-1/2} - \frac{\mathrm{sign}(\bar{u})}{60} \left[ \left( \rho_{i+2} - \rho_{i-3} \right) - 5 \left( \rho_{i+1} - \rho_{i-2} \right) + 10 \left( \rho_i - \rho_{i-1} \right) \right]
\tag{2.113}
\]

Three times per time step, these interface values are evaluated within the Runge-Kutta (RK3) time integration scheme to achieve third-order accuracy in time and variable (first- through sixth-) order accuracy in space. The RK3 procedure for a single time step from an arbitrary time N to time

racy in space. The RK3 proceedure for a single time step from an arbitrary time N to time

N + 1 is: ∆ ρN+1/3 ρN u¯ t ρN ρN = i+1/2 i 1/2 (2.114) − ∆x − −   u¯∆t ρN+1/2 ρN ρN+1/3 ρN+1/3 (2.115) = ∆ i+1/2 i 1/2 − x − −   u¯∆t ρN+1 ρN ρN+1/2 ρN+1/2 (2.116) = ∆ i+1/2 i 1/2 − x − −  

It should be kept in mind that these approximations are made within the Eulerian framework, solving the transport equation given in (2.3), and not within the Lagrangian integrated framework.

Chapter 3

Results

3.1 Spatial Approximation Performances

3.1.1 1-D

3.1.1.1 Semi-Lagrangian CCS

All experiments performed in 1-D showed a relative mass change of O(±10⁻¹⁴), which is what would be expected in double-precision arithmetic if a method were conservative. The reason a relative measure is taken is that in some test cases to come in 2-D, the grid spacings will be larger, which leads to more mass in general (and thus also larger mass loss in magnitude). Judging the mass loss relative to the mass that existed to begin with (both of which are grid-spacing dependent) renders the measure independent of grid spacing. Also, all runs in this 1-D section use a uniform wind of 1 m/s, a mesh of x ∈ (0,1), and a Courant number of 0.5. This implies that ∆t = 1/(2n), and there are 2n time steps, where n is the number of cells. PPM in all of these runs is run with the monotonic limiter.
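The conservation diagnostic quoted here is straightforward to compute; a minimal sketch (names illustrative) is:

    import numpy as np

    def relative_mass_change(rho_bar, rho_bar0, area):
        """Area-weighted mass now versus initially; normalizing by the
        initial mass makes the diagnostic independent of grid spacing."""
        m0 = np.sum(rho_bar0 * area)
        return (np.sum(rho_bar * area) - m0) / m0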

Before presenting any error norms for the different methods, it is good to gain a feel for how the methods perform in various test cases. For the sake of brevity, only plots of the square wave, sine wave, and irregular signal are given to show the strengths and weaknesses of the different methods. Fig. 3.1 shows plots of all three of these test cases compared against the actual averages being advected. Calculations have previously shown that PDLM and PDHM have very similar errors. Therefore, PDHM will be used, and PDLM will be discarded for the rest of the study because PDHM is the faster and arithmetically better-defined of the two. Also, note that a positive-definite filter is not used in this section, as the 1-D simulations are not the application but rather serve the study and development of the new schemes. The positive-definite filter will be applied in all of the 2-D cases.

Fig. 3.1a was run with 100 cells (and thus, 200 time steps) and is zoomed in on the left discontinuity because the other is simply a mirror image of it, and all of the error in this test case occurs due to this gradient-smoothing phenomenon. From this figure, it is clear that PPM-H approximates the steepness of the discontinuity the best without overshooting the jump. Clearly the worst-performing method is PDHM, with PHM not far from it. It may seem paradoxical that PHM should provide so much benefit when spliced in at the discontinuity in PPM-H, yet by itself perform very poorly at the discontinuity. This is because, though PHM approximates the jump itself well, it does not approximate the less extreme slopes of the later result as well as PPM does. The curve to which PPM eventually relaxes holds a steeper gradient than the curve to which PHM relaxes.

Fig. 3.1b was run with 50 cells. Only the maximum is shown because in the sine wave test, this is where the vast majority of the error occurs. This test case reveals PPM's tendency to clip extrema because of the first-order representation (piecewise constant) caused by the monotonicity limiter. As before, PRM is very similar to PPM but slightly less accurate. What is very impressive, however, is the ability of PPM-H to resolve the local maximum with almost perfect shape preservation, and PHM is not too far behind this result. This indicates that the errors for smoother functions will likely be better for PHM than for PPM, which is a welcome result. PDHM, as usual, produces an inferior representation of the sine wave by clipping the maximum even more than the rest of the schemes.

Figure 3.1: Comparison of 5 different spatial schemes. (a) Square wave comparison (100 cells, 1 revolution); (b) sine wave comparison (50 cells, 10 revolutions); (c) irregular signal comparison (50 cells, 1 revolution). Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Table 3.1: L2 error norms and orders of convergence for the square wave.

# Cells   20             40             80             160            320            640
Method    L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.
PPM       0.19409  —     0.14964 1.30   0.11517 1.30   0.08811 1.31   0.06733 1.31   0.05154 1.31
PHM       0.22788  —     0.17836 1.28   0.13897 1.28   0.10792 1.29   0.08365 1.29   0.06472 1.29
PDHM      0.23693  —     0.18505 1.28   0.14383 1.29   0.11134 1.29   0.08598 1.29   0.06630 1.30
PRM       0.20114  —     0.15695 1.28   0.12140 1.29   0.09370 1.30   0.07213 1.30   0.05543 1.30
PPM-H     0.19075  —     0.14338 1.33   0.11151 1.29   0.08348 1.34   0.06448 1.29   0.04945 1.30

Fig. 3.1c was also run with 50 cells and gives an indication of how each method handles an irregular signal. PPM-H shows a very clear advantage for sharply changing data with large numbers of opposing extrema, as it follows the profile better than any other method, mimicking the inflection of the original data the best. As the grid spacing decreases, it would be expected that PPM-H would show less of an advantage because the profile itself would become less sporadic and more smooth. This suggests that for particularly sporadic quantities, PPM-H would have an advantage. As with the other profiles, PRM is very similar to PPM, and this seems to be a fairly robust pattern. PRM seems to be only slightly less accurate than PPM but with a lower operation count. As has been the case before, PDHM smooths the data the most. PHM seems to be somewhat more accurate than PDHM, but it still smooths the data more than PPM. Judging from these plots, it seems that in general, PHM excels in the representation of smooth data without the clipping exhibited by PPM in the presence of smooth extrema. However, PPM overall handles the steep gradients better than PHM. Again, this is not contradictory, because the final result of PHM is not a direct reflection of its ability to handle jump discontinuities but of its accuracy on the whole. PPM-H handles all cases more accurately than PPM in the 1-D framework, and this is not only visually evident but can be seen from the errors in Table 3.1.

Table 3.2: L2 error norms and orders of convergence for the triangle wave.

# Cells   20             40             80             160            320            640
Method    L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.
PPM       0.33322  —     0.11312 2.95   0.05526 2.05   0.02431 2.27   0.01055 2.30   0.00458 2.30
PHM       0.38937  —     0.11943 3.26   0.05474 2.18   0.02515 2.18   0.01135 2.22   0.00510 2.23
PDHM      0.45189  —     0.17646 2.56   0.08352 2.11   0.03934 2.12   0.01803 2.18   0.00816 2.21
PRM       0.33324  —     0.11057 3.01   0.05799 1.91   0.02606 2.23   0.01182 2.20   0.00535 2.21
PPM-H     0.28497  —     0.07951 3.58   0.03678 2.16   0.01520 2.42   0.00729 2.09   0.00307 2.38

Table 3.3: L2 error norms and orders of convergence for the sine wave.

# Cells   20              40              80              160             320             640
Method    L2       Ord.   L2       Ord.   L2       Ord.   L2       Ord.   L2       Ord.   L2       Ord.
PPM       6.60E-02  —     1.66E-02 3.98   3.91E-03 4.23   9.29E-04 4.21   2.13E-04 4.37   4.88E-05 4.36
PHM       5.90E-02  —     7.44E-03 7.93   1.05E-03 7.09   1.88E-04 5.60   4.43E-05 4.23   1.28E-05 3.48
PDHM      1.25E-01  —     2.86E-02 4.39   5.83E-03 4.90   1.14E-03 5.10   2.15E-04 5.32   3.93E-05 5.47
PRM       6.88E-02  —     2.01E-02 3.43   5.39E-03 3.72   1.44E-03 3.75   3.84E-04 3.74   1.03E-04 3.72
PPM-H     3.11E-02  —     5.71E-03 5.45   1.04E-03 5.51   3.08E-04 3.36   5.87E-05 5.25   7.28E-06 8.07

Table 3.4: L2 error norms and orders of convergence for the steep gradient profile.

# Cells   20             40             80             160            320            640
Method    L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.
PPM       0.19528  —     0.11019 1.77   0.06248 1.76   0.02895 2.16   0.01033 2.80   0.00441 2.34
PHM       0.22139  —     0.13730 1.61   0.08792 1.56   0.04939 1.78   0.02387 2.07   0.00931 2.56
PDHM      0.23268  —     0.14794 1.57   0.09338 1.58   0.05290 1.77   0.02536 2.09   0.00944 2.69
PRM       0.19834  —     0.11702 1.70   0.06955 1.68   0.03463 2.01   0.01322 2.62   0.00338 3.91
PPM-H     0.19018  —     0.10459 1.82   0.05751 1.82   0.02580 2.23   0.00953 2.71   0.00450 2.12

Table 3.5: L2 error norms and orders of convergence for the irregular signal profile.

# Cells   20             40             80             160            320            640
Method    L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.   L2      Ord.
PPM       0.14138  —     0.10860 1.30   0.06688 1.62   0.03369 1.98   0.01282 2.63   0.00380 3.37
PHM       0.16350  —     0.12416 1.32   0.08662 1.43   0.04791 1.81   0.02404 1.99   0.00914 2.63
PDHM      0.17183  —     0.12397 1.39   0.08974 1.38   0.05098 1.76   0.02695 1.89   0.01067 2.52
PRM       0.14893  —     0.11200 1.33   0.07145 1.57   0.03694 1.93   0.01582 2.34   0.00428 3.69
PPM-H     0.13566  —     0.10400 1.30   0.06100 1.70   0.02879 2.12   0.01051 2.74   0.00341 3.08

Tables 3.1-3.5 give the L2 errors and convergence orders over a large range of resolutions for the five different input types to give a more quantitative comparison of the different methods. First, we consider the square wave results in Table 3.1. What is most immediately obvious is that none of the methods showed an order much greater than 1.3. This might seem to contradict the higher-order principle, but in the presence of non-smooth data (especially data that is not even C0 continuous), a scheme will never achieve its full order of accuracy. This is because the data itself is binary (0 or 1) with no continuous derivative at the discontinuities. Since all error occurs at these discontinuities, it should not be expected that the error would converge as quickly as in the case of smooth, infinitely differentiable data.

Clearly, PPM-H outperforms PPM at all resolutions. As for PHM, the error is 20%-25% more than that of PPM, which is roughly comparable at these magnitudes. PRM, as would be expected, is worse than PPM but not by much, with only 5%-7.5% greater error. PDHM, also as expected, had the worst error of all of the methods, but not to the point of being an outlier.

Moving attention to the triangle wave errors in Table 3.2, one can see that the orders of convergence on the whole are much better, and this is certainly because the discontinuity is a contact discontinuity and not a jump, meaning that the data is at least C0 continuous. In this case, PPM-H shows very clear advantages over PPM, in that PPM shows anywhere from 15-30% greater error, the greater errors occurring at the smaller grid spacings because PPM-H has a better order of convergence. PHM gives an accuracy much more comparable to PPM in this case, with errors usually only 5-10% greater. As before, PRM is only a little worse than PPM, with a maximum of about 16% greater error. In fact, on the whole, PHM is closer to PPM for the triangle wave approximation. In this case, PDHM is starting to become a bit of an outlier, sometimes nearly 80% worse than PPM, and it is highly unlikely that PDHM will have redeeming qualities in the next test cases, as its visual representation was always inferior to the other methods. Next is possibly the smoothest of functions save only a field of constant value: the sine

wave, shown in Table 3.3. Note that the orders of convergence are much higher even than the methods themselves formally demonstrate. This is often the case with very smooth data. The most noticeable numbers in this table are certainly the performance of PHM and PPM-H across the board in comparison to PPM. The PPM-H L2 error ranged from only 15-50% of the error of PPM, which is more than 6 times more accurate in some cases. PHM had anywhere from 20-90% of the error of PPM, which translates into almost 5 times better than PPM at times. At no resolution was PPM better than PPM-H or PHM. What is very interesting here is that even PDHM exceeded the accuracy of PPM at higher resolutions because the PDHM convergence order steadily increased. This suggests that at high resolutions for very smooth functions, PDHM can be quite accurate. Now, we consider a case similar to that of the square wave though slightly less severe: the steep gradient profile shown in Table 3.4. In this test case, PPM-H generally gives the best errors, though it performs slightly more poorly than PPM at the highest resolution here.

PRM, as with the other fairly non-smooth functions, seems to keep up with PPM fairly well. PDHM, again, has the poorest accuracy, and PHM ranges from 20% worse to, at most, twice as poor as PPM in errors. Considering, finally, the irregular signal in Table 3.5, we see an erratic combination of sharp extrema, jump discontinuities, and the like. In this case, PPM-H consistently performs better than the other methods and generally converges more quickly in the error norm. PRM, again, is slightly worse than PPM, and PDHM performs the worst of all methods. PHM generally ranges from 15% more error to twice the error of PPM because of its poorer convergence for this case. Now, we sum up the characteristic performances of the different methods, using PPM as the baseline, before applying them in more complicated frameworks. PHM generally performs much better than PPM with smooth data but is generally worse with discontinuous or erratic data. With simple discontinuities, PHM is still comparable, but in extreme cases such as the irregular signal, PHM can become twice as bad as PPM at extremely fine resolutions.

PDHM seems to perform poorly relative to PPM on all accounts. PRM seems comparable to PPM for reasonably difficult cases, but for smooth functions PRM actually degenerates to have the worst error at very fine resolutions. The hybrid method, PPM-H, seems to outperform PPM on all accounts, even for discontinuous cases but especially for smooth functions. Also, the advantages of PPM-H are typically present at lower resolutions, which are the resolutions typically run in a GCM. This means it will likely offer tangible advantages if implemented at current resolutions. Table 3.6 shows the relative speeds in computing a sine wave test case with 1,000 cells over 4,000 time steps (2 revolutions) to gain a true comparative efficiency measure. The code is compiled with all available optimizations (including the SSE2 vectorization instruction set) with the Intel Fortran 90 version 9.1.045 compiler for an Intel Core2 Duo 6400, 2.12 GHz processor. The implementation matches the operation counts of Tables 2.2 and 2.3, with every attempt made to optimize every scheme as much as possible by the use of temporary variables. The times themselves are an average of 10 runs for each method to obtain a reliable CPU count. The trajectory calculations are not included in the timed loop because they can be done before the simulation and kept static. Therefore, these times largely represent computation due to the different schemes alone. Strangely, PDHM cannot be considered slower than PHM because the standard-deviation ranges about the means overlap. Logarithms, therefore, must not be a terribly expensive operation. All of the new non-polynomial methods show a slight improvement over PPM's efficiency, running in about 5% less time. With PPM-H, it seems that, at least for the sine wave (in which PHM is activated reasonably often), the hybrid PHM replacement alters the time very little. What is notable, however, is that on a regular mesh, for which the boundary value reconstruction for PPM and PRM is much more efficient, the computation times are about 40% lower. In this regime, in fact, PRM gives a 10% increase in speed. Regardless, the narrower stencils of the new methods compared to PPM's 4-cell stencil would likely affect the efficiency in a parallel model much more than these speeds.

Table 3.6: 1-D 10-run average CPU times and standard deviations for the 1,000-cell sine wave problem with 4,000 time steps (2 revolutions). Units in seconds. The suffix "Reg" means the scheme was run with a regular-mesh boundary value reconstruction (which is much more efficient). These were performed with Intel Fortran compiler options "-c -O3 -axT".

Scheme     PPM     PHM     PDLM    PDHM    PRM     PPM-H   PPM Reg  PRM Reg  PPM-H Reg
Avg. Time  2.2834  2.1543  3.1854  2.1597  2.1458  2.2803  1.5172   1.3693   1.5177
St. Dev.   0.0176  0.0138  0.0348  0.0187  0.0184  0.0153  0.0112   0.0123   0.0220

Table 3.7: 1-D 10-run average CPU times and standard deviations for the 1,000-cell sine wave problem with 4,000 time steps (2 revolutions). Units in seconds. The suffix "Reg" means the scheme was run with a regular-mesh boundary value reconstruction (which is much more efficient). These were performed with Intel Fortran compiler options "-c -fast".

Scheme     PPM     PHM     PDLM    PDHM    PRM     PPM-H   PPM Reg   PRM Reg   PPM-H Reg
Avg. Time  1.9158  2.2192  2.6236  2.2473  1.8580  1.9110  1.384090  1.342896  1.390389
St. Dev.   0.0066  0.0092  0.0206  0.0061  0.0054  0.0040  0.003301  0.003665  0.006801

Though PPM's regular mesh is 30% more efficient than the average non-polynomial scheme, this advantage can only be exploited in the zonal remappings in spherical application, which accounts for only half of PPM's computational time. Thus, an effective 10% speed advantage for PPM from this effect is not likely to overcome the narrower-stencil advantage of PHM and PDHM. Therefore, in terms of speed of computation in a parallel context, PHM and PDHM are likely to have the upper hand.

After the previous results were obtained, it was found that changing compiler options changes the relative speeds of the methods quite a bit. Table 3.7 shows the same experiment performed with a different compiler option (noted in the caption). Here, we can see that PPM increased in efficiency and PHM decreased. Therefore, given this high sensitivity to the particular compiler and the compiler optimizations used, the more appropriate conclusion is that PPM, PPM-H, and PHM are all of a similar order of magnitude as far as CPU time is concerned. Additionally, there is a large increase in efficiency when the regular grid is used, no matter what the optimizations are.

3.1.1.2 Eulerian WRF

Here, the results from the WRF simulations are displayed and tabulated to give a measure of comparison between the SL schemes tested in the previous section and something widely used in current research. Figs. 3.2, 3.3, and 3.4 collect the same three runs performed in the previous section for comparison against Fig. 3.1, for WRF runs with third-, fourth-, and fifth-order interpolation of the cell-interface density values. Other than solving the Eulerian equations with RK3 time integration, the simulation parameters and initial data are identical to the previous section. For the sake of brevity, second- and sixth-order results are not graphically shown because they are typically not used for real simulations (neither is the fourth order). This is because of the well-known tendency of spatially even-ordered Eulerian schemes to produce large-amplitude, small-wavelength, poorly damped spatial noise in the solution. For convenience, the WRF schemes with third-, fourth-, and fifth-order advection will be referred to as WRF3, WRF4, and WRF5, respectively. It is fairly evident from the plot of the WRF4 solutions in Fig. 3.3 why that particular scheme is not typically used. While smooth data is resolved very well, the non-smooth data causes large noise that propagates out from the discontinuity into the entire domain, contaminating the solution. Comparing WRF3 with PPM-H and PPM, it is clear that the better SL methods resolve both steepness and alternating local extrema better while remaining monotonic with no overshoots. WRF5 shows steepness resolution comparable to PPM-H, except that WRF5 still produces fairly large overshoots. Also notable is that WRF5 resolves alternating local extrema (Fig. 3.4c) slightly better than PPM-H, as visible in the range x ∈ [0.3, 0.55]. The WRF schemes perform the 4,000-time-step, 1,000-cell sine wave transport problem about six to eight times faster than the semi-Lagrangian PPM, PHM, and PPM-H schemes. This is quite significant as far as speed of computation is concerned. However, it is quite difficult to compare the two approaches in such a direct manner.

Figure 3.2: Plot of WRF spatially third-order results with Eulerian RK3 integration in time. (a) Square wave comparison (100 cells, 1 revolution); (b) sine wave comparison (50 cells, 10 revolutions); (c) irregular signal comparison (50 cells, 1 revolution). Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.3: Plot of WRF spatially fourth-order results with Eulerian RK3 integration in time. (a) Square wave comparison (100 cells, 1 revolution); (b) sine wave comparison (50 cells, 10 revolutions); (c) irregular signal comparison (50 cells, 1 revolution). Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

Figure 3.4: Plot of WRF spatially fifth-order results with Eulerian RK3 integration in time. (a) Square wave comparison (100 cells, 1 revolution); (b) sine wave comparison (50 cells, 10 revolutions); (c) irregular signal comparison (50 cells, 1 revolution). Note that the domains extend from 0 to 1 in all cases, but for plotting clarity, domain subsets are plotted. All are run with a uniform wind speed of 1 m/s and a Courant number of 0.5 (meaning ∆t = 1/(2n)).

First, steep data is certainly better resolved by the PPM-H scheme than by the WRF scheme, and the semi-Lagrangian schemes can handle significantly larger time steps than the WRF schemes while remaining stable. The real reason for the incompatibility of comparing running time is that the WRF schemes are by no means positive definite. They produce relatively large overshoots (more than 10% of the jump magnitude in some cases), which violate scalar positivity. The SL schemes, on the other hand, in the worst case may violate positivity by a tenth of a percent (in the PHM case). Hole-filler algorithms can fix the positivity issues, but they result in artificial global transport of the scalar and typically artificial diffusion as well. This is kept orders of magnitude smaller by the SL schemes than by the WRF schemes. Additionally, for the maxima overshoots (which do not violate positivity), there is no fix for schemes that oscillate heavily, and small-scale oscillations will inevitably occur. This is yet another advantage of the SL schemes that is difficult to quantify in a comparison. If positivity, local conservation, and oscillations are not considered largely important to a particular application, then the WRF schemes would certainly be the best choice. Otherwise, monotonic (or very nearly monotonic) schemes should be used. This section does show, however, that the new SL schemes have accuracy comparable to the WRF schemes, and in non-deformational flows this accuracy does not diminish very quickly with increasing time step, leading to good efficiency in transporting quantities which are not needed every time step.

3.1.2 2-D Cartesian

Knowing how the schemes tend to behave, they may now be brought more knowledgeably into the more complicated framework of 2-D Cartesian advection. The uniform advection mentioned earlier was used as a test case in the development of this framework, and it turns out that it adds practically no new information. On that basis, only the solid-body rotation cases will be mentioned in this section, as some methods tended to perform differently in this case because of interaction with the cascading approach. In the solid body test cases, the boundaries are still implemented as cyclic. However, this plays no role because none of the data tested actually reaches the boundaries, except the 2-D sine wave data, which was for testing the framework itself using uniform advection. The cyclic boundaries are really just for absolutely ensuring mass conservation. In fact, the relative mass change in all of the 2-D Cartesian runs was always less than O(±10⁻¹⁵), whether the domain used a grid spacing of O(10) or O(10⁵).

The first test case is the solid body rotation of a cosine hill, which is matched to and compared against Zerroukat et al. (2002), hereafter ZWS02, and Zerroukat et al. (2005), hereafter ZWS05. These tests are also performed in Zerroukat et al. (2007), but that study used the parabolic spline method (PSM), which involves a global construction of each cell's representation by matching extra information at the boundaries between two cells (such as derivatives) as well as the cell means (required for conservation). Because PSM is a global calculation, it really lies in a class other than these schemes, which are local. Therefore, only ZWS02 and ZWS05 will be used for comparison, and because SLICE and SLICE-M utilize a

5- and 6-cell stencil, respectively, they are, in fact, a difficult comparison for these new methods. The first test is the solid body rotation of a cosine hill, and the parameters of the experiment are given in the caption of Fig. 3.5. Fig. 3.5 shows plots of the cosine cone advection with the five different schemes with the same scaling for each. Table 3.8 also gives the error norms associated with this experiment and these plots. It is evident from the plots alone that PPM-H caused the least diffusion of all of the schemes, and PDHM caused the most. What was unexpected was that PHM would perform so poorly compared to PPM with the smoother cosine hill function. This is either due to the interaction between PHM's handling of smooth functions and the cascade handling of solid body rotation, or due to the small number of grid points on the domain making the cosine hill fairly non-smooth. To decide between the two, the simulation was run with 100 cells instead of 33, and it was found that PHM does indeed outperform PPM. Therefore, the small number of grid points is what causes the poor PHM performance in this case, and it behaves similarly to the 1-D cases.

Figure 3.5: Cosine cone solid body rotation. (a) PPM; (b) PHM; (c) PDHM; (d) PRM; (e) PPM-H. Surface plotted after 1 rotation, nx = ny = 33, nt = 71, Ω = (0, 32×10⁵ m)², ωr = 10⁻⁵ s⁻¹, ∆t = 2π∆x/nt. The x- and y-axes have units of ×10⁵ m.

Table 3.8: Error norms for the cosine hill solid body rotation experiment. SLICE and SLICE-M are defined by ZWS05.

Method    L1      L2      L∞       Lmin     Lmax
PPM       0.2160  0.2107  0.3283    0.0000  -0.3283
PHM       0.4333  0.3526  0.4360    0.0000  -0.4360
PDHM      0.4914  0.3879  0.4810    0.0000  -0.4810
PRM       0.2462  0.2267  0.3376    0.0000  -0.3376
PPM-H     0.1484  0.1503  0.2468    0.0000  -0.2468
SLICE     0.2252  0.1217  0.1034   -0.0307  -0.1061
SLICE-M   0.1531  0.1041  0.0910    0.0000  -0.0933

What is notable is that PPM-H has only about 70% of the error of PPM in the

L2 norm, which shows a significant improvement. Compared to the 5-cell SLICE algorithm of ZWS05 (which uses piecewise cubics), PPM-H actually performs better in the L1 norm, though it performs more poorly in L2 and L∞. The reason for this poorer performance in the L2 and L∞ norms is evident from the Lmax norm, which shows the difference in maxima between the analytical solution and the numerical one. The SLICE schemes simply represent the middle peak more accurately than the hybrid scheme, but on the whole, the hybrid scheme calculates a more accurate solution, as denoted by the L1 norm, which does not weight the larger error magnitudes more in the global error calculation.

The next test case is also from ZWS05, in which a slotted cylinder undergoes solid body

rotation, and the parameters for this simulation are in the caption of Fig. 3.6. Table 3.9 gives the error norms for the methods used in this study along with norms from ZWS05 for external comparison. What is strange is the fact that PPM seems to perform better than the cubics of both SLICE and SLICE-M. Using another external comparison, there is also a stark improvement in using the PPM method coded in ZWS07, so this is not a unique finding. The most likely cause is the fact that cubics are more oscillatory than parabolas; therefore, their limiting is likely to be more severe, causing even the 6-cell monotonic post-processing limiter to still give poorer accuracy than with PPM. However, the cubics do give a better representation of the steep gradients (when no monotonic limiter is applied), as is evident from the superior L∞ norms, which inevitably occur at these steep gradients.

Figure 3.6: Slotted cylinder solid body rotation. (a) PPM; (b) PHM; (c) PDHM; (d) PRM; (e) PPM-H. Surface plotted after 1 rotation, nx = ny = 101, nt = 96, Ω = [0,100]², ωr = 2π/(nt∆t), ∆t = 1800 s. The x- and y-axes have units of m.

Table 3.9: Error norms for the slotted cylinder solid body rotation experiment. SLICE and SLICE-M are defined by ZWS05.

Method    L1      L2      L∞       Lmin     Lmax
PPM       0.1869  0.2311  0.6194    0.0000   0.0000
PHM       0.2516  0.2762  0.6398    0.0000  -0.0073
PDHM      0.2778  0.2882  0.6559    0.0000   0.0179
PRM       0.1977  0.2374  0.6245    0.0000   0.0000
PPM-H     0.1700  0.2224  0.5910    0.0000   0.0003
SLICE     0.2223  0.2391  0.5843   -0.1509   0.1354
SLICE-M   0.2133  0.2457  0.6369    0.0000  -0.0089

What is very nice to see here is that

PPM-H gives better accuracy than even the SLICE-M scheme in all error norms. The L∞ norm is better because the slope is held more sharply by PPM-H than by SLICE-M. This is an especially significant result because PPM-H still only uses the same stencil as PPM. PRM, as usual, gives comparable error, and PHM, as seen visually, does not handle the steep gradients very well. PDHM continues to show the most smoothing and the largest errors.

The last test to be performed is the solid body rotation of the Leveque data, followed by an examination of the shape preservation via contour plots. The Leveque experiment (Leveque, 2002) uses a domain of nx = ny = 80 grid cells with a grid spacing of ∆x = ∆y = 0.025 m. The number of time steps is nt = 75 at a time step value of ∆t = 1 s and a rotation rate of ωr = 2π/(nt∆t) ≈ 8.376 × 10⁻² s⁻¹. Figs. 3.7-3.9 show surface and contour plots of the Leveque solutions after one full rotation. It might seem strange that Fig. 3.7 shows PPM as having overshoots (values greater than 1). However, observing that these values are not at the discontinuities

and that they are very small, O(10⁻⁷), investigation showed that they resulted from the CCS scheme in the process of solid body rotation and not directly from PPM. The new density after every time step is not just a product of the monotonic integration but also of division by the new cell area. If the new cell area has errors, the new density may, in fact, violate monotonicity to a small extent. PRM shares the same phenomenon near the center of the square block distribution, revealing that this slight violation of monotonicity is rooted in the boundary approximations (which PPM and PRM share). This is not too much of a surprise because the boundaries are not fit to the actual cell means; they are fit to a modification, see (2.14) and (2.15), which, combined with possible small errors in dividing out the Eulerian arrival cell area, has led to small arithmetic errors. This effect is only seen in Cartesian solid-body rotation and not in the uniform advection.

Typically, the Leveque data is tested with contours to judge the shape preservation from a plan view. It is clear that most methods preserve the shapes well, with very little distortion shown in the contour plots. PDHM smooths the gradient the most (as is usually the case), as can be seen from the contours being farther apart from one another, and the distortion is greater with PHM and PDHM than with the other methods. The contours of PDHM reveal that it has some problems with overshoots, which are O(10⁻²). PRM, as seems to be the rule with fairly discontinuous data, gives a representation very close to that of PPM. PPM-H diffuses the gradients the least, as is especially evident in the contour plot of the cone in Fig. 3.9b: it resolves the cone's height to well above 90%, while all of the other methods smooth it to below 90% of its original height. Table 3.10 gives the error norms for the Leveque data for intracomparison of this study's methods. There is only modest improvement in PPM-H over PPM. PHM and PPM-H show interesting behavior in the Lmax norms (also present in Table 3.9 for the slotted cylinder), which reveals that the entire top of the square block has been decreased by at least 0.0019 for PHM and 0.0005 for PPM-H. This could only have been caused by the positive-definite filter, which adds mass to negative values and takes that mass away from the rest of the domain, weighted such that the largest values have the most mass taken away. Clearly, PPM-H outperformed

PPM in every error norm. The Lmax norm reveals that PDHM overshoots are the worst of any of the methods.
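For reference, the error measures reported in these tables can be computed as in the following minimal sketch. The normalization (relative to the exact field) is one common convention and is an assumption here, as are the function and variable names; Lmin and Lmax are taken as differences of the numerical and exact extrema so that a monotonicity-preserving scheme yields zeros for both:

```python
import numpy as np

def error_norms(q_num, q_exact):
    """L1, L2, L-infinity, Lmin, and Lmax error measures."""
    diff = q_num - q_exact
    l1   = np.sum(np.abs(diff)) / np.sum(np.abs(q_exact))
    l2   = np.sqrt(np.sum(diff ** 2) / np.sum(q_exact ** 2))
    linf = np.max(np.abs(diff)) / np.max(np.abs(q_exact))
    lmin = q_num.min() - q_exact.min()   # < 0 indicates an undershoot
    lmax = q_num.max() - q_exact.max()   # != 0 indicates an overshoot or an eroded maximum
    return l1, l2, linf, lmin, lmax
```

A negative Lmax, as seen for PHM above, therefore signals that the field's maximum has been eroded rather than overshot.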

(a) Surface plot (b) Contour plot

Figure 3.7: PPM: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

(a) Surface plot (b) Contour plot

Figure 3.8: PDHM: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

(a) Surface plot (b) Contour plot

Figure 3.9: PPM-H: 1 revolution of Leveque data solid body rotation. See text for experiment specifications.

Table 3.10: Error norms for the Leveque data solid body rotation experiment.

Method   L1       L2       L∞       Lmin      Lmax
PPM      0.0929   0.1711   0.5786   0.0000     0.0000
PHM      0.1254   0.2061   0.6154   0.0000    -0.0019
PDHM     0.1418   0.2161   0.6235   0.0000     0.0139
PRM      0.0985   0.1759   0.5862   0.0000     0.0000
PPM-H    0.0847   0.1640   0.5651   0.0000    -0.0005

3.1.3 2-D Spherical

This study has found that in spherical geometry, the present model does not advect data over the poles as accurately as the implementation in Nair et al. (2002). Because the polar grid cells are very small in area, the errors at the poles do not greatly affect the L1 and L2 errors, but they do affect the L∞ errors considerably, because the L∞ error is not in mass space but in density space, such that a small surface area no longer weights the quantity to a lesser value. Therefore, external comparison with other published results is made only in the event of an extreme gain in accuracy, and internal comparison (with the present study's model only) is the main method of relative evaluation of the new schemes. This difference at the poles should be kept in mind throughout this section. The first test case to be performed in spherical geometry is the solid body rotation of a cosine hill across the poles (αr = π/2) on a grid of nλ = 128 by nµ = 65 grid points.

The revolution takes place over 12 days with nt = 256 time steps yielding a time step of

∆t = 12(24)(3600)/256 = 4050s and an angular velocity of ωr = 2π/(12 × 24 × 3600) ≈ 6.06 × 10⁻⁶ s⁻¹. This yields a meridional Courant number of about Cθ = 0.5. Because the present model does not take into account the adaptations of Nair (2004), the meridional Courant

number must be restricted to be less than unity. Fig. 3.10 shows contour plots of the results of the methods in this study. Clearly, PPM-H keeps the magnitude of the cosine hill more accurately than does PPM. All methods show some distortion at the bottom of the curve, which is due to the inevitable polar distortion. As expected from the Cartesian experiments, PHM and PDHM perform the poorest for this quasi-smooth function. Table 3.11 gives the error norms for each of the methods in this experiment along with the values from SLICE-S of Zerroukat et al. (2004) for some external comparison. It should be noted before comparison that it is well known and quite evident in the literature that the monotonic filter greatly increases the L∞ error, which occurs in this case at the cosine hill's

(a) PPM (b) PHM

(c) PDHM (d) PRM

(e) PPM-H

Figure 3.10: Contour plots of the results of spherical polar (αr = π/2) advection of a cosine hill over the sphere.

Table 3.11: Error norms for the spherical cosine hill solid body rotation experiment.

Method    L1       L2       L∞       Lmin     Lmax
PPM       0.0808   0.0867   0.1486   0.0000   -0.1486
PHM       0.1936   0.1723   0.2025   0.0000   -0.2000
PDHM      0.2394   0.1997   0.2372   0.0000   -0.2372
PRM       0.1003   0.0988   0.1632   0.0000   -0.1631
PPM-H     0.0534   0.0625   0.1130   0.0000   -0.1129
SLICE-S   0.0790   0.0490   0.0420   —        —

center due to the clipping of extrema. Monotonic filtering also increases the L1 error by necessity because the L1 error linearly judges the overall error. Even with the increased L∞ norm due to monotonic limiting, the PPM-H scheme still outperforms the cubics used in SLICE-S

(which is not monotonically limited) on the sphere in the L1 norm. This is a significant finding which shows that in a spherical implementation environment, PPM-H offers a great advantage over the previous methods and is competitive among the other methods used in conservative

SL transport. Clearly, because the L∞ norm is so much greater with monotonic limiting, the L2 norm is quadratically increased to a value greater than SLICE-S. In order to gain a feel for how the error is varying as the cosine hill advects over the

poles for the different methods, Fig. 3.11 shows the L1 and L∞ error norms evolving in time throughout the simulation for PPM and PPM-H. There is a very evident and fairly structured 2∆t oscillation in the error norms as they evolve through time similar in structure to what one would see in a leapfrog time integration. The amplitude of this oscillation is much larger with

L∞, which certainly shows that the oscillation is occurring in only a relatively small number of

grid points; namely, the grid points with the largest error magnitudes. Because the oscillation occurs the entire time, it must be present in the cosine hill itself; and tracking the maximum

error via the Fortran maxloc function, it was found that the L∞ error generally comes from the cosine hill center. This means that this phenomenon is closely tied to the existence of the smooth maximum at the cosine hill center. Most likely, this is simply an effect of having a

meridional Courant number close to 0.5, which means that every other time step the center of the cosine hill is over a boundary, and on opposing time steps the center is over a cell center. Thus, the resultant piecewise constant representation at each time step has an alternating higher and lower value.

It is very common to test the spherical cosine hill advection in a quasi-polar fashion, meaning that αr = π/2 − 0.05, to place the center of the cosine hill slightly offset from polar advection. This will test the distortion of the poles on an asymmetric pass by the poles. Fig. 3.12 shows the results of this simulation with the same parameters as before except with a modified αr. Clearly, the results are no longer symmetric in the longitudinal direction as before, but the

distortion is still very small, and the errors are, in fact, better than in the full polar case.

The next test case is a quasi-polar deformational flow such that αr = π/2.2 and λp =

π + 0.025, meaning that the rotated sphere's pole is located at about λp = 181.43°, θp = 81.82°. This is to make sure there is no symmetry in the deformational flow and to cause the errors to be

a little higher. Fig. 3.13 shows a plot of the analytical solution to the smooth deformational flow with a plot of the PPM-H solution as well. Comparison is not made between different methods visually as it is very difficult to view the differences. For this reason, qualitative comparison is made visually to inspect the general characteristics of the errors and quantitative

comparison is made via error norms only in Table 3.12. What is most immediately noticeable are the problems at the polar regions, evident in a visual comparison between Fig. 3.13a and Fig. 3.13b. This propagates out into the solution as a whole but only to a small extent because the surface area is so small and the error contribution is therefore relatively small. Other than that, the error seems to manifest mainly in that the fluid is not rotated as much near the poles

in the numerical solution as it is in the analytical solution. Judging from the error norms in Table 3.12, it is clear that PPM-H offers little relative advantage over PPM in this context, with only around an 8% increase in accuracy. If the poles are

not included in the L∞ calculation, they decrease down to about 2.9 rather than their current values of 4.2 to 4.5 (in units of 10⁻²), which shows clearly that there is a deficiency in the polar cap interpolations relative to Nair et al. (2002). One important thing to note is that there is actually a slight increase in accuracy when using PRM relative to PPM, which is not usually the case.

(a) PPM

(b) PPM-H

Figure 3.11: L1 and L∞ norm plots for PPM and PPM-H polar solid-body rotation of a cosine hill on the sphere.

(a) PPM (b) PHM

(c) PDHM (d) PRM

(e) PPM-H

Figure 3.12: Contour plots of the results of spherical quasi-polar (αr = π/2 − 0.05) advection of a cosine hill over the sphere.

(a) Analytical

(b) PPM-H

Figure 3.13: Contours of smooth quasi-polar deformational flow experiment.

Table 3.12: Error norms for the smooth deformational flow experiment.

Method   L1           L2           L∞           Lmin           Lmax
PPM      1.169×10⁻³   3.719×10⁻³   4.223×10⁻²   -1.438×10⁻⁵    1.591×10⁻⁵
PHM      1.201×10⁻³   3.899×10⁻³   4.553×10⁻²   -1.048×10⁻⁴    1.049×10⁻⁴
PDHM     1.297×10⁻³   4.318×10⁻³   4.524×10⁻²   -8.927×10⁻⁷   -3.814×10⁻⁷
PRM      1.129×10⁻³   3.643×10⁻³   4.259×10⁻²   -3.290×10⁻⁶   -5.446×10⁻⁶
PPM-H    1.083×10⁻³   3.407×10⁻³   4.219×10⁻²   -1.438×10⁻⁵    1.591×10⁻⁵

The positive Lmax norms and negative Lmin norms show the maximum overshoot and undershoot magnitudes. For PPM, PHM, and PPM-H, it is clear that they are roughly symmetric in magnitude, while with PDHM and PRM there are undershoots but no overshoots.

Chapter 4

Conclusions and Future Work

Four new conservative non-polynomial based sub-grid approximations have been adapted from existing literature to the CISL framework and tested along with a hybrid method called PPM-Hybrid (PPM-H). These sub-grid approximations are the Piecewise Hyperbolic Method (PHM), the Piecewise Double Hyperbolic Method (PDHM), the Piecewise Double Logarithmic Method (PDLM), and the Piecewise Rational Method (PRM). In the PPM-H scheme, PPM is used in general and PHM is used at steep jumps where PPM experiences accuracy degeneration. The CISL method is extended into 2-D via the Conservative Cascade Scheme (CCS) and further brought into spherical geometry via the (λ, µ) coordinate transformation and special treatments at the polar caps. It was found that PDLM experiences large truncation errors when the grid is refined to a small grid spacing, and therefore, the scheme is not used outside the 1-D context. Generally, PHM, PDHM, and PRM offered no robust advantage over the current most common method: the Piecewise Parabolic Method (PPM). For very smooth data, however, PHM can outperform PPM with almost 5 times less error in some cases. This is not a robust result, though, because PHM has twice the error of PPM for steep jumps. The PPM-H method provided a significant improvement over PPM in all cases, even spherical cases. In fact, for polar solid body rotation of a cosine hill on the sphere, PPM-H outperformed even the

SLICE-S scheme (which uses piecewise cubics for the subgrid distribution) in the L1 norm. Even for non-smooth data such as the slotted cylinder in 2-D Cartesian, PPM-H outperformed PPM in all error norms and remained monotonic.

It should be noted also that because the (λ, µ) grid is stretched in the meridional direction, the PHM and PDHM reconstructions are not performing at full accuracy, because the derivatives are being approximated at the geometric middle of two cell centers while the actual interface is offset. There are future plans to test the effect in spherical geometry of taking a truly higher-order approximation to the lateral derivatives in the meridional direction. This could be most easily accomplished by fitting a monotone Hermite cubic polynomial to the four cell means surrounding a particular boundary and then calculating the derivative of the approximated cubic to approximate the true interface derivative. This is indeed computationally expensive, but it may yield superior accuracy since the order of the approximation itself is due to the order of approximation of the lateral derivatives. The use of the monotonic Hermite function should yield some interesting results, as a simple cubic Lagrange fit has already been tested in this context, and it yielded wildly oscillating results. Hopefully the monotonic limiting would reduce the magnitude of the derivatives and provide a stable and accurate approximation. Also, in the future, the polar cubic interpolations will be performed on polar tangent planes to increase the accuracy of the interpolation, since it is currently being performed in (λ, µ) space.
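To make the proposed interface-derivative treatment concrete, the following minimal sketch fits a monotonicity-limited cubic Hermite through the four surrounding cell means and differentiates it at the true interface location. The function name, the specific (Fritsch-Carlson-style) limiter, and the treatment of cell means as point values at the cell centers are all illustrative assumptions, not this study's implementation:

```python
import numpy as np

def interface_derivative(xc, qc, x_int):
    """Estimate dq/dx at the interface x_int between the middle two of
    four cells with centers xc and means qc (xc may be irregular, as on
    a stretched (lambda, mu) grid, and x_int need not be the geometric
    midpoint of xc[1] and xc[2])."""
    d10 = (qc[1] - qc[0]) / (xc[1] - xc[0])
    d21 = (qc[2] - qc[1]) / (xc[2] - xc[1])
    d32 = (qc[3] - qc[2]) / (xc[3] - xc[2])

    def limit(dl, dr):
        # Zero slope at extrema, otherwise a harmonic mean: this keeps
        # the Hermite fit monotone between the two middle cell centers.
        return 0.0 if dl * dr <= 0.0 else 2.0 * dl * dr / (dl + dr)

    m1, m2 = limit(d10, d21), limit(d21, d32)

    # Derivative of the cubic Hermite on [xc[1], xc[2]] at x_int
    h = xc[2] - xc[1]
    s = (x_int - xc[1]) / h
    return (qc[1] * (6*s**2 - 6*s) / h + qc[2] * (6*s - 6*s**2) / h
            + m1 * (3*s**2 - 4*s + 1) + m2 * (3*s**2 - 2*s))
```

For linear data the sketch reproduces the exact slope; the limiting only activates near extrema and steep jumps, which is where the Lagrange cubic oscillated.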

Chapter 5

Further Applications

Throughout this study, only one application of the conservative remapping improvements via the Conservative Cascade Scheme (CCS) has been used: namely, conservative semi-Lagrangian advection. Since remapping in and of itself is such a general concept, it can also be applied to conservative interpolation algorithms and to remapping of multiple model levels in a fluid model with Lagrangian vertical coordinates.

5.1 Application to Conservative Interpolation

This section is devoted to the explanation and testing of a conservative grid-to-grid interpolation procedure based on the non-polynomial approximations and the hybrid approximation. For simplicity of development, assumptions were made that the domain of interpolation must be in spherical geometry with coordinates that are both regular and rectangular in (λ, µ) space. Furthermore, it is assumed that no cell may include a polar point within its domain. If the input grid does indeed include a pole point, the domain is truncated to the neighboring equatorward cell. For true global interpolation to be performed, wrapping cyclically in the zonal direction and with a π-shift wrapping at the poles, data must have the meridional domain boundaries

directly on the pole points and the zonal domain boundaries matching with a 2π offset on the initial boundary. Otherwise, the domain boundaries are truncated and considered to be rigid, and a two-cell (using input cell grid spacing) buffer zone is placed between the input domain boundaries and the output domain boundaries to make sure the 4-cell stencils of PPM, PRM,

and PPM-H can be implemented at full accuracy for a meaningful comparison between the newly introduced methods.

5.1.1 The Conservative Interpolation Procedure

To give an example of conservatively interpolating coarse data down to a finer mesh, consider

Fig. 5.1. Fig. 5.1a shows an example of halving the grid spacing of the data with the input grid in black and the output grid in red. Similar to the CCS process, before a meridional sweep is performed, the scalars are all multiplied by the input zonal grid spacing such that when integrating in the meridional direction, true mass is indeed computed. Next, for each column, approximations are fit to the input cell values (schematically represented by Fig. 5.1b), and

integration is performed up to each intermediate boundary in order to compute the mass within the intermediate cells (represented by Fig. 5.1c). The intermediate boundaries are considered to simply be the µ values of the output grid. There may be multiple intermediate boundaries per input cell or multiple input cells between intermediate boundaries, depending upon whether the meridional grid spacing is being increased or decreased. The intermediate cell masses are then divided by the input zonal grid spacing and the output meridional grid spacing to obtain the intermediate scalars. Supposing the input grid has Mλ × Mµ grid cells and the output grid has Nλ × Nµ grid cells, the intermediate grid will have Mλ × Nµ grid cells. Now that the intermediate mesh scalar values have been obtained, the scalars are multiplied by the output meridional grid spacing such that when integrating in the zonal direction, true mass will be obtained. Next, approximations are fit to the intermediate cells in the zonal

(a) Input grid in thick black lines and output grid in dashed red lines
(b) Meridional approximation fitting to input grid data
(c) Meridional integration up to the intermediate boundaries
(d) Zonal approximation fitting to intermediate grid data
(e) Zonal integration up to target grid boundaries

Figure 5.1: Schematics demonstrating the process of conservative interpolation on regular rectangular grids using the CCS. Gray shading denotes the fitting of approximations to data oriented along black arrows. Pink shading denotes the integration over approximations oriented along red arrows.

orientation (schematically represented by Fig. 5.1d), and integration is performed up to each target boundary in order to compute the mass within each of the target grid cells (represented by Fig. 5.1e). Finally, the mass values within the target grid cells are divided by the target grid's zonal and meridional grid spacing, and a final scalar value is obtained conservatively on the target grid.
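The heart of each sweep above is a one-dimensional conservative remap: integrate the subgrid reconstruction up to every output boundary and difference the cumulative masses. The following minimal sketch uses a piecewise-constant reconstruction purely for brevity (the actual scheme would substitute PPM, PHM, PDHM, PRM, or PPM-H); the names are illustrative:

```python
import numpy as np

def remap_1d(xb_in, q_in, xb_out):
    """Conservatively remap cell means q_in on input boundaries xb_in to
    cell means on output boundaries xb_out (both ascending, with the
    output domain contained in the input domain).  With a piecewise-
    constant reconstruction the cumulative mass M(x) is piecewise linear,
    so np.interp evaluates it exactly at the output boundaries."""
    mass_in = np.concatenate(([0.0], np.cumsum(q_in * np.diff(xb_in))))
    mass_at_out = np.interp(xb_out, xb_in, mass_in)
    return np.diff(mass_at_out) / np.diff(xb_out)

# Halving the grid spacing, as in Fig. 5.1a, conserves total mass exactly:
xb_in  = np.linspace(0.0, 1.0, 6)       # 5 input cells
q_in   = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
xb_out = np.linspace(0.0, 1.0, 11)      # 10 output cells
q_out  = remap_1d(xb_in, q_in, xb_out)
assert np.isclose((q_in * np.diff(xb_in)).sum(), (q_out * np.diff(xb_out)).sum())
```

Applying such a remap column-by-column (meridionally) and then row-by-row (zonally), with the mass scalings described above, reproduces the full cascade procedure.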

5.1.2 Sine Wave Test Case

In order to gain insight into how well the data is being interpolated, fine-resolution data will be conservatively interpolated to a coarser grid with the higher-order conservative interpolation using smooth data. The interpolated data will then be compared against analytically calculated values to obtain an error norm for the interpolation and compare the methods. Smooth, cyclic data is desirable in this context, and in order to obtain this, we can assume that the analytical density distribution is a 2-D sine wave in (λ,θ)-space as follows:

\rho(\lambda,\theta) = \sin(k\lambda)\,\sin(l\theta) \qquad (5.1)

In this formulation, the only constraints are that k and l both be integer quantities because the domain has a zonal length of 2π and a meridional length of π. This ensures that the

distribution is spatially cyclic and thus smooth across the boundaries. When transferring this continuous distribution to a discrete representation, one cannot simply place the value at the cell center as the cell mean because that value is not generally the integrated mean across the cell area. Therefore, the function must be integrated over the area of the cell to obtain mass,

and that mass must then be divided by the cell area to obtain the true integrated mean of the

analytical function within the cell. The spherical integrated mass for a cell centered at (λi, µj) can be expressed as

M_{ij} = \int_{\lambda_{i-1/2}}^{\lambda_{i+1/2}} \left[ \int_{\theta_{j-1/2}}^{\theta_{j+1/2}} f(\lambda,\theta)\cos(\theta)\, d\theta \right] d\lambda \qquad (5.2)

where \lambda_{i\pm 1/2} and \theta_{j\pm 1/2} represent the boundaries of the cell centered at (λi, µj). Furthermore, because \rho(\lambda,\theta) = g(\lambda)h(\theta), this can be simplified to

M_{ij} = \left[ \int_{\lambda_{i-1/2}}^{\lambda_{i+1/2}} \sin(k\lambda)\, d\lambda \right] \left[ \int_{\theta_{j-1/2}}^{\theta_{j+1/2}} \sin(l\theta)\cos(\theta)\, d\theta \right]. \qquad (5.3)

Applying the trigonometric identity, \sin(a)\cos(b) = [\sin(a-b) + \sin(a+b)]/2, this simplifies even further to

M_{ij} = \frac{1}{2} \left[ \int_{\lambda_{i-1/2}}^{\lambda_{i+1/2}} \sin(k\lambda)\, d\lambda \right] \left[ \int_{\theta_{j-1/2}}^{\theta_{j+1/2}} \sin[\theta(l-1)]\, d\theta + \int_{\theta_{j-1/2}}^{\theta_{j+1/2}} \sin[\theta(l+1)]\, d\theta \right] \qquad (5.4)

which is trivially integrated analytically to obtain the analytical mass values:

M_{ij} = \frac{\cos(k\lambda_{i+1/2}) - \cos(k\lambda_{i-1/2})}{2k} \left[ \frac{\cos[\theta_{j+1/2}(l-1)] - \cos[\theta_{j-1/2}(l-1)]}{l-1} + \frac{\cos[\theta_{j+1/2}(l+1)] - \cos[\theta_{j-1/2}(l+1)]}{l+1} \right] \qquad (5.5)

Once the mass is calculated, the analytical density for a given cell is the ratio of the cell's mass

to the cell's surface area, A_{ij} = \Delta\lambda_i\,\Delta\mu_j, where again \mu = \sin\theta:

\bar{\rho}_{ij} = \frac{M_{ij}}{\Delta\lambda_i\,\Delta\mu_j} \qquad (5.6)
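For illustration, the closed-form cell means (5.5)-(5.6) can be evaluated directly. The sketch below assumes a regular (λ, µ) grid and integer l > 1 (as in the test cases that follow); the function name and grid sizes are illustrative:

```python
import numpy as np

def cell_means(k, l, nlon, nlat):
    """Integrated mean density (5.6) of rho = sin(k*lam)*sin(l*theta) on a
    regular (lambda, mu) grid, using the closed-form mass (5.5)."""
    lam = np.linspace(0.0, 2.0 * np.pi, nlon + 1)            # zonal boundaries
    th  = np.linspace(-np.pi / 2.0, np.pi / 2.0, nlat + 1)   # meridional boundaries

    # Zonal factor: the exact integral of sin(k*lam) over each cell
    zon = (np.cos(k * lam[:-1]) - np.cos(k * lam[1:])) / k

    # Meridional factor from the sin(a)cos(b) identity, eqs. (5.4)-(5.5)
    mer = ((np.cos(th[:-1] * (l - 1)) - np.cos(th[1:] * (l - 1))) / (l - 1)
         + (np.cos(th[:-1] * (l + 1)) - np.cos(th[1:] * (l + 1))) / (l + 1))

    mass = 0.5 * np.outer(zon, mer)                          # eq. (5.5)

    dlam = np.diff(lam)                                      # cell widths in lambda
    dmu  = np.diff(np.sin(th))                               # cell widths in mu = sin(theta)
    return mass / np.outer(dlam, dmu)                        # eq. (5.6)

rho_bar = cell_means(k=3, l=6, nlon=128, nlat=64)
```

These exact cell means serve both as the input data to be remapped and as the reference against which the interpolated output is scored.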

Fig. 5.2 shows three plots of this data profile: one with k = 1, l = 2 spanning one wave- length in each direction, another with k = 3, l = 6 spanning three wavelengths in each direction,

and another with k = 9, l = 18 spanning nine wavelengths in each direction. Because spanning one wavelength across the globe is invariably smoother than spanning multiple wavelengths across the globe in a discrete framework, these three will be tested to gain an idea of how the interpolation accuracy responds to this change in smoothness. The set of test cases will include the problem of conservatively interpolating regular 1° data down to a coarser 2.8° grid, which implies a loss of information in the interpolation. An interpolation up to a finer grid will not be tested because such a process would need to arbitrarily "predict" the subgrid variation in a one-step situation, which is not realistic in practice. In reality, a statistical or dynamical downscaling of data is performed rather than a simplistic interpolation, because the characteristics of the resulting finer grid data should not be so dependent upon the subgrid functional approximations used.

5.1.3 Results and Conclusions

Performing the three test cases mentioned in section 5.1.2, Table 5.1 provides a summary of the results in terms of the L1 error norm across the global domain. What may be surprising here is that the method that outperformed all of the others in the transport application, PPM-H, turns out to be the worst method for all interpolation test cases. This is not contradictory, though, because conservative interpolation is a very different application from semi-Lagrangian transport: the mass is not brought

back to the arrival grid to be divided by arrival cell areas, and only one step is performed in this process. Therefore, it is not necessarily the order of convergence that dictates accuracy in this case but rather the actual accuracy of matching the true inner-cell distribution for this one step of remapping. Seemingly, application of the non-polynomial subgrid approximations

is very promising for conservative interpolation when remapping to a coarser grid because all

of the non-polynomial methods improved upon PPM in the L1 norm. In fact, PDHM, which performed the worst in the SL transport application, performs the best overall in this finite volume cascade interpolation algorithm. Even as the data becomes less and less smooth in the discrete domain, PDHM remains somewhat more accurate than the other methods.

139 (a) k = 1, l = 2: 1 wavelength in each direction (b) k = 3, l = 6: 3 wavelengths in each direction

(c) k = 9, l = 18: 9 wavelengths in each direction

Figure 5.2: Contour plots of sine wave data on the sphere. All plots are from the same perspective, and the integrated mean densities were calculated on a grid with 64 cells in the meridional direction and 128 cells in the zonal direction.

Table 5.1: Tabulation of the L1 error norms for conservative interpolation of three sets of data (Nλ = 1, 3, & 9) for the five methods of this study. Nλ represents the number of wavelengths spanned zonally and meridionally across the global domain.

Method   Nλ = 1    Nλ = 3    Nλ = 9
PPM      6.50E-5   1.90E-4   1.34E-3
PPM-H    1.43E-4   3.77E-4   1.44E-3
PHM      5.65E-6   3.49E-5   1.01E-3
PDHM     9.30E-8   5.22E-6   9.22E-4
PRM      7.52E-8   1.55E-5   1.03E-3

Part II

Investigating Higher-Order Semi-Implicit Semi-Lagrangian Temporal Accuracy

Chapter 6

Introduction

6.1 Semi-Implicit Semi-Lagrangian (SISL) Methods

See section 1.1 for a review of general semi-Lagrangian methods. Although semi-Lagrangian methods avoid the time step restrictions due to the CFL limit for advection, when applied to a nonlinear set of dynamical equations such as the shallow water equations, Euler equations, or Navier-Stokes equations, faster modes propagate through the fluid, such as gravitational oscillations and acoustic (compressional) waves. In such regimes, the semi-Lagrangian method offers no stability advantage because waves much faster than atmospheric advection limit the time step. Robert et al. (1972) introduced a semi-implicit handling of the terms responsible for gravity waves in the shallow water equations in the Eulerian advection framework, and this was shown to allow the model to run stably at any time step with regard to gravity wave motions, being limited only by advection. Later, combining the two methods, Robert (1981, 1982) implemented the semi-implicit semi-Lagrangian (SISL) method for the meteorological equations, which is theoretically unconditionally stable in time. With this SISL method the temporal truncation error, being limited now by accuracy rather than stability, may be allowed to increase to approximately that of the spatial truncation error, and it has demonstrated a great

improvement in efficiency. Even though the SISL models are unconditionally stable in theory, it has often been found that certain forcings may excite gravity waves with a period of oscillation which is very close to the (extended) time step. In these cases, a resonance occurs between the gravity wave and the numerical scheme in which non-physical computational modes of the solution are excited and grow unstably. Particularly, bottom orographic forcing is known to excite this phenomenon (Staniforth and Cote, 1991; Nair and Tufo, 2007; Cote et al., 1998; Machenhauer and Olk, 1996; Eliassen and Raustein, 1968). The most common solution to this problem is to employ what is known as trajectory uncentering (Cote et al., 1998; Semazzi et al.,

2005; Lauritzen and Nair, 2007) in which the linear center in time of the coefficients applied to the forcing terms is off-centered from the linear center in time of the LHS by some prescribed amount. In a first-order context (that is, employing two time levels in the uncentering), this simply implies placing more weight on the forcing at the future time step and less weight on the current time step when evaluating the time tendency of a dependent variable.

6.2 Examining SISL performance for Gravity Waves

In Semazzi et al. (2005), SISL simulation of internal gravity wave (IGW) motion is analyzed in hydrostatic and non-hydrostatic regimes to study the performance of SISL methods in simulating smaller-scale phenomena. The reasoning for studying IGW motion is that as computational resources increase and model resolution also increases, it is beneficial to know how well a given method performs in the presence of smaller-scale phenomena. In global coarse resolution models, gravity wave solutions are considered noise in comparison to the larger-scale phenomena. However, in smaller-scale simulations, they play integral roles in forcing weather events and interacting with the mean flows. It was found using a stretched-grid SISL dynamical core (Lauritzen, 2007; Lin, 2004;

Machenhauer et al., 2007) that non-hydrostatic simulations of idealized terrain-forced IGW motion perform more accurately at larger Courant numbers than do their hydrostatic counterparts. Particularly, the phase truncation errors dominate the hydrostatic solution. Following the full dynamical core experiments, a linear 2-D model was derived based on the atmospheric equations of motion, assuming harmonic input of the dependent variables using length scales typical of IGW motion (O(1 to 10 km)). As noted in Holton (2004), the vertical scale of atmospheric motion for IGW must be less than the vertical atmospheric scale height (about 8 km) for the Boussinesq approximation to be well-posed. It was again confirmed that non-hydrostatic simulations yield more accurate results both in phase and in amplitude than the hydrostatic simulations. In particular, the linear model solution gave valuable insight into the asymptotic behavior of the physical and computational modes. It was found that as ∆t → 0, the amplitude of the computational mode tends to zero while the amplitude of the physical mode tends to unity (an expected and positive result). Also, as the time step tends to zero, the computational frequency tends to positive and negative infinity (for eastward and westward propagating gravity waves, respectively) while the physical frequency tends to the analytical value. This means that as the time step increases, the computational modes both increase in magnitude and begin to obtain a frequency similar to the physical modes, which would hinder the solution. The sum of the eastward propagating physical and computational modes would render the eastward gravity wave solution in the scheme (likewise for westward) used in Semazzi et al. (2005). The concluding remarks stated that the growing amplitude of the computational mode may, in fact, act to offset the damping in the physical mode as the time step increases. However, the increase of computational mode amplitude with time step was observed to be linear while the decrease of physical mode amplitude with time step was observed to be superlinear. Also in the conclusions, it was stated that the sharp degradation in accuracy at higher Courant numbers strongly implies that a temporally higher-order method may be necessary in order to overcome

this. The purpose of this section is to investigate a class of two-time-step methods known as Adams-Moulton (AM) methods, which can obtain arbitrarily high-order accuracy as a greater number of forcing time steps are included. Several modifications and deviations will be made relative to the Semazzi et al. (2005) study, hereafter SSP05. First, the SSP05 derivation includes spatial error introduced by the second-order centered derivative approximations in the pressure gradient terms and mass continuity equation. A method is devised in this study to remove this error by calculating the limit of the terms containing spatial grid spacing as the grid spacing is brought to zero. Experimentation showed that this change had virtually no effect on the solution for all AM methods tested in this study; therefore, spatial error will remain removed throughout all of the simulations. Also, trajectory uncentering will initially be removed from the problem and reintroduced later using a different mechanism of performing uncentering than that of SSP05. This is done because AM methods obtain accuracy of mth-order by using predetermined coefficients on m forcing terms, which in and of itself precludes the use of traditional uncentering (which is introduced by altering the forcing coefficients). A method of removing the Boussinesq approximation in SSP05 has been constructed by prescribing an exponential function for the vertical variation of density such that the fractional vertical gradient of density becomes a constant which is easily incorporated into the linear solutions. The relaxation of the Boussinesq approximation was experimentally determined to have virtually no effect, similar to the removal of spatial error in the derivative approximations. Finally, the sensitivities of the SISL solution to mean state temperature and Brunt-Vaisala frequency are analyzed.

Though the comparative performance of hydrostatic and non-hydrostatic regimes of simulation will be tested in this study, it is not the focus as in SSP05. The focus of this study is the change in performance of the SISL method for IGW motion as the order of convergence is increased from first- to third-order convergence. This is in response to the concluding

remark that higher-order temporal accuracy may be necessary to sustain larger time steps and still simulate local scale effects of atmospheric motion such as IGWs.

Chapter 7

Methodology

7.1 Model Equations

The set of atmospheric dynamic equations of SSP05 (similar to those of Qian and Kasahara, 2003) are brought into two dimensions by neglecting the meridional gradients and wind. Then, they are linearized by first assuming all dependent variables consist of a background and perturbation flow: \psi_T = \bar{\psi} + \psi, where \psi_T is the total quantity of an arbitrary dependent variable, \bar{\psi} is the background flow, and \psi is the perturbation. Next, all products including more than one perturbation are removed to give linearity to the system under the assumption that the perturbations are much smaller in magnitude than the background flows. The vertical background

flow is assumed to be negligible compared to the zonal background flow, and therefore \bar{w} is removed from the equations. Also, the Coriolis force (due to the 2-D approximation) and curvature may be neglected. The conservation equations for zonal velocity, vertical velocity, energy, and mass (respectively) are therefore given as follows with Lagrangian time derivatives:

\frac{Du}{Dt} = -\frac{1}{\bar{\rho}}\frac{\partial\phi}{\partial x} \qquad (7.1)

\delta\,\frac{Dw}{Dt} = -\frac{1}{\bar{\rho}}\frac{\partial\phi}{\partial z} - g\left(\frac{\rho}{\bar{\rho}}\right) \qquad (7.2)

\frac{D\rho}{Dt} = \frac{\bar{\rho}\Lambda^2 w}{g} \qquad (7.3)

\frac{\partial u}{\partial x} + \frac{\partial w}{\partial z} = -\frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\, w \qquad (7.4)

It should first be noted that φ is not representative of geopotential height as in the usual convention but rather represents pressure. This is indicated mathematically as it is scaled by inverse density. The term δ is given as a non-hydrostatic switch in the model such that

δ = 1 renders a non-hydrostatic solution and δ = 0 renders a hydrostatic solution. Also, the stability-related parameter, Λ, is the Brunt-Vaisala frequency, which is given the value of Λ ≈ 0.012 s⁻¹, corresponding to a buoyancy period of nearly 8 minutes for a tropospheric integrated average approximation.

The Γ parameter is another flag which invokes the Boussinesq approximation if set to zero; otherwise, it is set to unity. Note that the derivation of this form of the continuity equation comes from the fact that the fully compressible continuity equation states:

\frac{\partial\rho}{\partial t} = -\nabla\cdot\left(\rho\vec{V}\right) = -\left(\rho\frac{\partial u}{\partial x} + u\frac{\partial\rho}{\partial x} + \rho\frac{\partial w}{\partial z} + w\frac{\partial\rho}{\partial z}\right) \qquad (7.5)

In the approximation used here, the equations are still incompressible because density is not allowed to evolve in time in the continuity equation which excludes the existence of acoustic

(compressional) waves. In the approximation here, we assume incompressibility (meaning all density terms are the basic state density, denoted by \bar{\rho}), negligible variations of density in the horizontal, and the possible inclusion of vertical variation of density (using the Γ flag shown

149 earlier): ∂u ∂w Γ ∂ρ¯ + = w (7.6) ∂x ∂z −ρ¯ ∂z

Therefore, when it is mentioned later that the Boussinesq approximation is being relaxed, this is strictly speaking of the relaxation in the vertical mean state and not in the horizontal.

7.2 The Semi-Implicit Semi-Lagrangian Discretizations

Before going over the different temporal discretizations, let us first assume a generic dependent variable, ψ, which is defined discretely on a grid at zonal and vertical indices of I and K, respectively, such that \psi_{I,K} is located at (I∆x, K∆z), where ∆x and ∆z are the zonal and vertical grid spacing, respectively. Spatial indices will be denoted via subscripts. Let us also assume a time index N such that \psi^N is defined at the time N∆t, where ∆t is the time

step. Also, let us assume a generic hyperbolic forcing F(\psi,\vec{x},t), where \vec{x} represents the position vector. The generic conservation laws may all be written, therefore, in the form:

\frac{\partial\psi}{\partial t} = F(\psi,\vec{x},t) \qquad (7.7)

such that at most a first-order derivative may be taken on the Right-Hand Side (RHS).

7.2.1 Two-Time Step Methods

The term two-time-step refers to the range of time indices used in the Left-Hand Side (LHS) of equation (7.7). All of the methods in this subsection will require only two time indices (namely

N and N + 1) in the LHS with any arbitrary number of forcing time indices used in the RHS. This class of methods includes several subclasses, including but not limited to implicit Runge-Kutta (RK) methods and Adams-Moulton (AM) methods (the implicit form of a technique virtually identical to Adams-Bashforth methods). For implicit RK methods, the coefficients

applied to the intermediate estimates are problem-specific and need to be solved for iteratively. Therefore, in practice, they are not used because they likely constitute more of a cost than the improved convergence they offer. AM methods are somewhat more attractive because there is no need to iteratively solve for coefficients; rather, the coefficients are predetermined and independent of the problem. Basically, AM methods linearly combine the previous time steps of information in order to determine the value at the future time step. The order of convergence is produced by the coefficients applied to the forcing terms. The downside to AM methods is that for an order of convergence, n, they require precisely n time indices of information for the forcing terms.

First, this requires more demand on fast memory during computation, which can become time consuming if it causes the model to exceed the availability of true physical memory, requiring frame switches. Second, more time indices of forcing means that more backward trajectories will be required, which will produce additional overhead. Third, AM methods span their forcing terms over a longer time period, which typically means that though a higher order of convergence is reached, the constant multiplying the O(∆t^n) error term is usually larger than if it were done over only one time step as in the RK methods. Therefore, AM methods are the only methods studied in this subsection, according to the following form:

\frac{\psi_{I,K}^{N+1} - \psi_{I,K}^{N}}{\Delta t} = \sum_{j=1}^{m} \alpha_j\, F_{I,K}^{N-j+2} \qquad (7.8)

A general solution is derived such that there may exist up to three forcing terms on the RHS

(m ≤ 3). The coefficients may be chosen as desired for any particular scheme. For instance, if we desire a second-order solution, then we simply set m = 2 such that α3 = 0. The AM scheme fits a Lagrange polynomial to the forcing terms and integrates over the polynomial to obtain the forcing over the time period from step N to time step N + 1. The coefficients, αi, for an

mth-order AM method (thus m forcing terms) are given as:

\alpha_{m-j} = \frac{(-1)^j}{j!\,(m-j)!} \int_0^1 \left[\prod_{i=0,\, i\neq j}^{m} (u + i - 1)\right] du \qquad (7.9)

Table 7.1: Forcing coefficients for Adams-Moulton schemes of first- through third-order as applied to (7.8).

AM Order   α1      α2     α3
First      1       0      0
Second     1/2     1/2    0
Third      5/12    2/3    -1/12

It turns out that a discretization with two time levels of ψ on the LHS and m time levels of forcing on the RHS can be used for any scheme of order {O(∆t^p) : p ≤ m}. Therefore, we limit the feasible order of temporal convergence to a value of m = 3, allowing 3 forcing terms with general coefficients on the RHS in the discretization. This renders (7.8) to a less generic form:

\frac{\psi_{I,K}^{N+1} - \psi_{I,K}^{N}}{\Delta t} = \alpha_1 F_{I,K}^{N+1} + \alpha_2 F_{I,K}^{N} + \alpha_3 F_{I,K}^{N-1} \qquad (7.10)
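To make the two-time-step AM update concrete, the sketch below applies (7.10) with the Table 7.1 coefficients to the linear oscillator dψ/dt = iωψ, which mimics a single harmonic mode; for this linear forcing the implicit α1 term can be solved for in closed form, mirroring the semi-implicit treatment. Halving the time step should reduce the final error by factors of roughly 2, 4, and 8 for the first- through third-order schemes. All names and parameter values are illustrative:

```python
import numpy as np

# Adams-Moulton forcing coefficients (alpha1, alpha2, alpha3) from Table 7.1
AM = {1: (1.0, 0.0, 0.0),
      2: (0.5, 0.5, 0.0),
      3: (5.0/12.0, 2.0/3.0, -1.0/12.0)}

def am_oscillator(order, omega, dt, nsteps):
    """Integrate d(psi)/dt = i*omega*psi with the AM form (7.10)."""
    a1, a2, a3 = AM[order]
    f = 1j * omega
    psi_m1 = np.exp(-f * dt)   # exact psi at step N-1 to avoid start-up error
    psi    = 1.0 + 0.0j        # psi at step N
    for _ in range(nsteps):
        # Solve the alpha1-implicit update directly for this linear forcing
        psi, psi_m1 = (psi + dt * f * (a2 * psi + a3 * psi_m1)) / (1.0 - dt * a1 * f), psi
    return psi

omega, T = 0.012, 2000.0       # an IGW-like frequency and total time
for order in (1, 2, 3):
    errs = [abs(am_oscillator(order, omega, T / n, n) - np.exp(1j * omega * T))
            for n in (100, 200)]
    print(order, errs[0] / errs[1])   # ratios near 2, 4, 8 confirm the order
```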

Table 7.1 gives values for the αi parameters for first- through third-order AM schemes as applied to (7.10) for the solution of a generic conservation equation. Therefore, (7.1)-(7.3) may be temporally discretized as follows:

\frac{u_{I,K}^{N+1} - u_{*,K}^{N}}{\Delta t} = -\frac{\alpha_1}{\bar{\rho}}\left(\frac{\partial\phi}{\partial x}\right)_{I,K}^{N+1} - \frac{\alpha_2}{\bar{\rho}}\left(\frac{\partial\phi}{\partial x}\right)_{*,K}^{N} - \frac{\alpha_3}{\bar{\rho}}\left(\frac{\partial\phi}{\partial x}\right)_{**,K}^{N-1} \qquad (7.11)

\delta\,\frac{w_{I,K}^{N+1} - w_{*,K}^{N}}{\Delta t} = -\frac{\alpha_1}{\bar{\rho}}\left[\left(\frac{\partial\phi}{\partial z}\right)_{I,K}^{N+1} + g\,\rho_{I,K}^{N+1}\right] - \frac{\alpha_2}{\bar{\rho}}\left[\left(\frac{\partial\phi}{\partial z}\right)_{*,K}^{N} + g\,\rho_{*,K}^{N}\right] - \frac{\alpha_3}{\bar{\rho}}\left[\left(\frac{\partial\phi}{\partial z}\right)_{**,K}^{N-1} + g\,\rho_{**,K}^{N-1}\right] \qquad (7.12)

\frac{\rho_{I,K}^{N+1} - \rho_{*,K}^{N}}{\Delta t} = \frac{\bar{\rho}\Lambda^2}{g}\left(\alpha_1 w_{I,K}^{N+1} + \alpha_2 w_{*,K}^{N} + \alpha_3 w_{**,K}^{N-1}\right) \qquad (7.13)

It should be noted that (∗,K) represents the departure locations at time step N traced upstream

from the regular grid defined at time step (N + 1). Likewise, (∗∗,K) represents the departure locations at time step (N − 1). K does not change when traced upstream in this model because the assumption \bar{w} = 0 removes vertical advection. Next, the upstream departure locations are assumed to be interpolated by a linearly weighted sum of regular grid point values as follows:

\psi_{*,K}^{N} = \sum_r C_r^{*}\,\psi_{I-P^{*}-r,\,K}^{N} \qquad (7.14)

\psi_{**,K}^{N-1} = \sum_r C_r^{**}\,\psi_{I-P^{**}-r,\,K}^{N-1} \qquad (7.15)

where C_r^{(i)} is the weight and r iterates over r = −1, 0, 1, 2 for a linear to cubic interpolation based upon the weightings. Additionally, a centered second-order approximation is applied to the derivative terms. Next, a harmonic assumption is made following Gill (1980) as follows:

\begin{pmatrix} (u,w) \\ (\rho,\phi) \end{pmatrix}_{I,K}^{N+1} = \lambda_{SISL}^{N+1} \times \begin{pmatrix} (\hat{u},\hat{w})\, e^{i[Ik\Delta x + K(m-iq)\Delta z]} \\ (\hat{\rho},\hat{\phi})\, e^{i[Ik\Delta x + K(m+iq)\Delta z]} \end{pmatrix}. \qquad (7.16)

scale height of an isothermal atmosphere with the given definition: q = 1/(2Hs)= g/(2RT0)

where R is the dry air constant and T0 is the isothermal temperature. For the actual value of

T0, see section 7.4. From here, the derivation of the solutions to this system of equations is completely identical to that in SSP05, with the only exception

that the forcing coefficients (α1, α2, α3) as given here are given the notation (α,β,γ) in SSP05. To briefly summarize the solution process, a homogeneous system, Ax = 0, is set up where

x^T = (\hat{u}, \hat{w}, \hat{\rho}, \hat{\phi}). To obtain non-trivial solutions to the problem (meaning x ≠ 0), we simply set det(A) = 0 and obtain a quartic polynomial equation in terms of λSISL expressed as:

\sum_{i=0}^{4} \mu_i \left(\lambda_{SISL}\right)^i = 0 \qquad (7.17)

\mu_4 = \delta + (\Delta t)^2\Lambda^2\alpha_1^2 - (\tau + i\eta)

\mu_3 = 2(\Delta t)^2\Lambda^2\alpha_1\alpha_2\Theta - 2\Theta\delta + 2\Theta(\tau + i\eta)

\mu_2 = \delta\Theta^2 + (\Delta t)^2\Lambda^2\left(\alpha_2^2\Theta^2 + 2\alpha_1\alpha_3\Omega\right) - \Theta^2(\tau + i\eta) \qquad (7.18)

\mu_1 = 2(\Delta t)^2\Lambda^2\alpha_2\alpha_3\Theta\Omega

\mu_0 = (\Delta t)^2\Lambda^2\alpha_3^2\Omega^2

where, as described in SSP05,

\tau = -\frac{\sin^2(m\Delta z)\cosh^2(q\Delta z) + \cos^2(m\Delta z)\sinh^2(q\Delta z)}{\sin^2(k\Delta x)}\left(\frac{\Delta x}{\Delta z}\right)^2 \qquad (7.19)

\eta = \frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\,\frac{\sin[(m+iq)\Delta z]}{\Delta z}\left(\frac{\sin(k\Delta x)}{\Delta x}\right)^{-2} \qquad (7.20)

\Theta = \sum_r C_r^{*}\, e^{-ik(P^{*}+r)\Delta x} \qquad (7.21)

\Omega = \sum_r C_r^{**}\, e^{-ik(P^{**}+r)\Delta x} \qquad (7.22)

The solutions to this quartic equation are reconstructed from the complex roots of (7.17).

Because we assumed \lambda_{SISL} = e^{-i\omega\Delta t}, the resulting amplitude after one time step is given as:

|\hat{\lambda}| = |e^{-i\omega\Delta t}| = |\cos(-\omega\Delta t) + i\sin(-\omega\Delta t)| = |\cos(\omega\Delta t) - i\sin(\omega\Delta t)| \qquad (7.23)

\Im(\hat{\lambda}) = -\sin(\omega\Delta t), \quad \Re(\hat{\lambda}) = \cos(\omega\Delta t) \qquad (7.24)

|\lambda_{SISL}| = \sqrt{\Re(\hat{\lambda})^2 + \Im(\hat{\lambda})^2} \qquad (7.25)

where the functions \Re(x) and \Im(x) extract the real and imaginary components, respectively, of the complex argument, x, and \hat{\lambda} represents a root of (7.17). Also, the time phase (or frequency) of the solution is given by:

\hat{\lambda} = e^{-i\omega\Delta t} = \cos(-\omega\Delta t) + i\sin(-\omega\Delta t) = \cos(\omega\Delta t) - i\sin(\omega\Delta t) \qquad (7.26)

\Im(\hat{\lambda}) = -\sin(\omega\Delta t), \quad \Re(\hat{\lambda}) = \cos(\omega\Delta t) \qquad (7.27)

\frac{\Im(\hat{\lambda})}{\Re(\hat{\lambda})} = \frac{-\sin(\omega\Delta t)}{\cos(\omega\Delta t)} = -\tan(\omega\Delta t) \qquad (7.28)

\omega_{SISL} = \frac{1}{\Delta t}\tan^{-1}\left(-\frac{\Im(\hat{\lambda})}{\Re(\hat{\lambda})}\right) = -\frac{1}{\Delta t}\tan^{-1}\left(\frac{\Im(\hat{\lambda})}{\Re(\hat{\lambda})}\right) \qquad (7.29)

Because the quartic polynomial has four roots, there are indeed four solutions. Two are deemed "physical" modes, and two are deemed "computational" modes. Typically, the physical modes are distinguished in this context by their near-unity amplitudes, and the computational modes are distinguished by their near-zero amplitudes. The physical modes propagate in opposite directions, where positive frequencies indicate propagation along the positive axis (propagating up and to the right), and negative frequencies indicate propagation along the negative axis (propagating down and to the left), because the wave numbers are always considered positive. The two physical amplitude solutions may or may not be the same, as will be seen in

section 7.4, but they will always be of order unity.
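The Von Neumann machinery just described is straightforward to script. The sketch below assembles the µ coefficients of (7.17)-(7.18) with spatial error removed (Θ = Ω = 1 and the limits of τ and η derived in section 7.3 below), finds the four complex roots numerically, and extracts each mode's per-step amplitude and frequency via (7.25) and (7.29); arctan2 is used to resolve the quadrant of the inverse tangent. Parameter values are illustrative:

```python
import numpy as np

def sisl_modes(alpha, dt, k, m, q, fvdg, Lam=0.012, delta=1.0, Gamma=1.0):
    """Roots of the quartic (7.17) with spatial error removed, plus the
    per-step amplitude (7.25) and frequency (7.29) of each mode."""
    a1, a2, a3 = alpha
    tau = -(m**2 + q**2) / k**2                          # eq. (7.31)
    eta = Gamma * fvdg * (m + 1j * q) / k**2             # eq. (7.30)
    te  = tau + 1j * eta
    c   = (dt * Lam)**2
    mu  = [delta + c * a1**2 - te,                       # mu_4
           2.0 * c * a1 * a2 - 2.0 * delta + 2.0 * te,   # mu_3
           delta + c * (a2**2 + 2.0 * a1 * a3) - te,     # mu_2
           2.0 * c * a2 * a3,                            # mu_1
           c * a3**2]                                    # mu_0
    lam  = np.roots(mu)                                  # four complex roots
    amp  = np.abs(lam)                                   # eq. (7.25)
    freq = -np.arctan2(lam.imag, lam.real) / dt          # eq. (7.29)
    return lam, amp, freq

# Third-order AM coefficients; 10 km / 5 km wave, dt = 400 s (illustrative)
q = 9.8 / (2.0 * 287.0 * 275.87)
lam, amp, freq = sisl_modes((5/12, 2/3, -1/12), dt=400.0,
                            k=2*np.pi/1.0e4, m=2*np.pi/5.0e3, q=q,
                            fvdg=-np.log(2.0)/5600.0)
# Two near-unity amplitudes (physical modes) and two small amplitudes
# (computational modes) are expected.
```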

7.3 Removal of Spatial Error

It turns out that the terms τ, η, Θ, and Ω together contain all of the influence of spatial error in the SISL solutions. Spatial errors are, therefore, removed from the solution by evaluating the

limit of τ, η, Θ, and Ω as ∆x → 0 and ∆z → 0. Trivially, if the grid spacing is brought to zero, Θ = Ω = 1, yielding perfect interpolation because the weights, C_r^{*} and C_r^{**}, must always sum to unity. The interpolation errors are neglected throughout this study as in SSP05. Taking the limits of τ and η, we obtain:

\eta = \frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\,\frac{m+iq}{k^2} \qquad (7.30)

\tau = -\frac{m^2 + q^2}{k^2} \qquad (7.31)

Therefore, comparing the results between (7.19)-(7.20) and (7.30)-(7.31), one can gain insight into the magnitude of spatial error introduced in the centered differences and how that interplays with the temporal accuracy in a linear sinusoidal context. It was originally hypothesized that for higher than second-order temporal accuracy, this second-order approximation to derivatives would begin to interfere with the temporal accuracy. However, experimentation shows that there is virtually no change in the SISL solution (compared to the ∆x = ∆z = 500m of SSP05) when the spatial error is removed, even with the third-order AM scheme. For this reason, it will remain removed throughout the remainder of the study. This is perhaps confirmation of

the SSP05 notion that the form of τ in (7.19) suggests a cancellation between horizontal and vertical spatial truncation error for IGW motion.

7.4 Relaxation of the Boussinesq Approximation

The Boussinesq approximation was set in place in SSP05 to ensure a solution to the system of equations. Here, the relaxation of this approximation will be explored. This approximation shows up only in the η term, in which case the mean state density, \bar{\rho}, must be prescribed

as some function of height. Because it is well known that density varies exponentially with height, an analytical expression for η can be given. We'll start by assuming a typical density

profile where the density is halved every 5.6km.

\bar{\rho}(z) = \rho_0\, 2^{-z/5600} \qquad (7.32)

where \rho_0 is the surface density, which turns out not to play a role in the solution. This also implies that:

\frac{\partial\bar{\rho}}{\partial z} = -\frac{\ln(2)}{5600}\,\bar{\rho}(z) \qquad (7.33)

which further implies that the total fractional rate of change of density with height is:

\frac{1}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z} = -\frac{\ln(2)}{5600} \approx -1.2378\times 10^{-4}\ \mathrm{m}^{-1} \qquad (7.34)

meaning that the relaxation of the Boussinesq approximation (Γ = 1) still yields a solution which is independent of height. The effects of this have been tested directly in the context of the present solutions, and there are some very interesting behaviors. Basically, gravity waves propagating upward are stable in amplitude, and gravity waves propagating downward are unstable. To state it most generally, the SISL method gains awareness of which direction it is propagating vertically, and its behavior changes between an upward regime and a downward regime. Of course, in reality, internal waves generally propagate diagonally with both a horizontal and vertical component. The

horizontal direction, however, seems to still have no effect because SL solutions move with the mean flow. In retrospect, it was found that this constant value of the fractional vertical density gradient was also used in Gill (1980) in the derivation of the vertical structure of gravity waves as used in these model solutions, where (1/\bar{\rho})(d\bar{\rho}/dz) = -1/H_S = -g/(RT_0). Assuming this scale height matches the standard atmospheric halving every 5.6 km, an isothermal basic state temperature

of T0 = 275.87K is extracted, and this is the value used for the present study. Also, although the derivation of the vertical structure of the gravity waves in Gill (1980) was used in a solution for equations with the Boussinesq approximation, the vertical structure does not require the

Boussinesq approximation but rather is derived via vertical integration over the scale height inverse. Therefore, removing the Boussinesq approximation should be consistent with this system of equations, assumptions, and solutions.
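As a quick arithmetic check of the values above, a small sketch (assuming g = 9.8 m s⁻² and R = 287 J kg⁻¹ K⁻¹, which are not stated explicitly in the text) reproduces the quoted basic state temperature:

```python
import numpy as np

g, R = 9.8, 287.0
Hs = 5600.0 / np.log(2.0)   # scale height matching density halving every 5.6 km
T0 = g * Hs / R             # isothermal basic state temperature: ~275.9 K
print(Hs, T0)
```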

7.5 Analytical Solutions

In SSP05, it is shown that analytical solutions can be derived from the same homogeneous system which derived the numerical SISL solutions by applying the harmonic assumption without a discretization and differentiating analytically. The harmonic assumption applied in this context is similar to (7.16) but includes the time directly and replaces discretized coordinates with

continuous ones as follows:

\begin{pmatrix} (u,w) \\ (\rho,\phi) \end{pmatrix} = \begin{pmatrix} (\hat{u},\hat{w})\, e^{i[kx+(m-iq)z-\omega_a t]} \\ (\hat{\rho},\hat{\phi})\, e^{i[kx+(m+iq)z-\omega_a t]} \end{pmatrix}. \qquad (7.35)

Therefore, when solving the homogeneous system, Ax = 0, by setting the determinant of the matrix equal to zero, det(A) = 0, the dispersion relationship for frequency as a function of wavelengths can be derived. This analytical solution needs to be rederived because the SSP05

solution included the Boussinesq approximation, which is to be relaxed in this study. Let us start by restating the equations used in this model in Eulerian form; expanding the material derivatives into local changes with advection, we have:

\frac{\partial u}{\partial t} = -\frac{1}{\bar{\rho}}\frac{\partial\phi}{\partial x} \qquad (7.36)

\delta\,\frac{\partial w}{\partial t} = -\frac{1}{\bar{\rho}}\frac{\partial\phi}{\partial z} - g\frac{\rho}{\bar{\rho}} \qquad (7.37)

\frac{\partial\rho}{\partial t} = \frac{\bar{\rho}\Lambda^2 w}{g} \qquad (7.38)

\frac{\partial u}{\partial x} + \frac{\partial w}{\partial z} = -\frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\, w \qquad (7.39)

Notice that the mean zonal velocity is neglected in these equations because the SISL solution, as ∆t, ∆x, ∆z → 0, moves with the flow and thus will give the frequency relative to the mean wind, which is also known as the intrinsic frequency. Next, we substitute our harmonic assumptions for the independent variables. Rather than express the elongated equations with these assumptions, we will here define the time and space derivatives of each of the terms as they appear, since the derivative of an exponential function returns the same function with a scaling factor.

\frac{\partial\{u,w,\rho,\phi\}}{\partial t} = -i\omega_a\,\{u,w,\rho,\phi\} \qquad (7.40)

\frac{\partial\{u,w,\rho,\phi\}}{\partial x} = ik\,\{u,w,\rho,\phi\} \qquad (7.41)

\frac{\partial\{u,w\}}{\partial z} = i(m-iq)\,\{u,w\} \qquad (7.42)

\frac{\partial\{\rho,\phi\}}{\partial z} = i(m+iq)\,\{\rho,\phi\} \qquad (7.43)

Using these relations, we obtain:

-i\omega_a\,\hat{u}\,e^{i[kx+(m-iq)z-\omega_a t]} = -\frac{1}{\bar{\rho}}\,ik\,\hat{\phi}\,e^{i[kx+(m+iq)z-\omega_a t]} \qquad (7.44)

-i\omega_a\,\hat{w}\,e^{i[kx+(m-iq)z-\omega_a t]}\,\delta = -\frac{1}{\bar{\rho}}\,i(m+iq)\,\hat{\phi}\,e^{i[kx+(m+iq)z-\omega_a t]} - \frac{g}{\bar{\rho}}\,\hat{\rho}\,e^{i[kx+(m+iq)z-\omega_a t]} \qquad (7.45)

-i\omega_a\,\hat{\rho}\,e^{i[kx+(m+iq)z-\omega_a t]} = \frac{\bar{\rho}\Lambda^2}{g}\,\hat{w}\,e^{i[kx+(m-iq)z-\omega_a t]} \qquad (7.46)

ik\,\hat{u}\,e^{i[kx+(m-iq)z-\omega_a t]} + i(m-iq)\,\hat{w}\,e^{i[kx+(m-iq)z-\omega_a t]} = -\frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\,\hat{w}\,e^{i[kx+(m-iq)z-\omega_a t]} \qquad (7.47)

Dividing out the common terms in each equation, we are able to obtain:

-\omega_a\,\hat{u} = -\frac{1}{\bar{\rho}}\,k\,\hat{\phi}\,e^{-2qz} \qquad (7.48)

-i\omega_a\,\hat{w}\,\delta = -\frac{1}{\bar{\rho}}(im-q)\,\hat{\phi}\,e^{-2qz} - \frac{g}{\bar{\rho}}\,\hat{\rho}\,e^{-2qz} \qquad (7.49)

-i\omega_a\,\hat{\rho}\,e^{-2qz} = \frac{\bar{\rho}\Lambda^2}{g}\,\hat{w} \qquad (7.50)

ik\,\hat{u} + (im+q)\,\hat{w} = -\frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}\,\hat{w} \qquad (7.51)

Next, the equations may all be brought to the left hand side to set up a homogeneous system where the terms have been manipulated to match the form given in SSP05.

\begin{pmatrix}
-\omega_a & 0 & 0 & \frac{1}{\bar{\rho}}\,k\,e^{-2qz} \\
0 & -i\omega_a\delta & \frac{g}{\bar{\rho}}\,e^{-2qz} & \frac{1}{\bar{\rho}}(im-q)\,e^{-2qz} \\
0 & -\frac{\bar{\rho}\Lambda^2}{g} & -i\omega_a\,e^{-2qz} & 0 \\
ik & im+q+\frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z} & 0 & 0
\end{pmatrix}
\begin{pmatrix}\hat{u}\\ \hat{w}\\ \hat{\rho}\\ \hat{\phi}\end{pmatrix}
= \begin{pmatrix}0\\0\\0\\0\end{pmatrix}. \qquad (7.52)

For this system to have meaningful (non-zero) solutions, we must set the determinant of the

matrix equal to zero, and the following dispersion relationship is derived for the model:

\omega_a = \pm\frac{k\Lambda}{\sqrt{\,m^2 + q^2 + \delta k^2 + \frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}(q - im)\,}} \qquad (7.53)

Clearly, the SSP05 solution may be extracted by simply setting Γ = 0, which enables the Boussinesq approximation. However, this analytical expression for the dispersion relationship is more general than that of SSP05 in order to allow not only H or NH specification but also to allow the relaxation of the Boussinesq approximation, since the approximation does indeed affect the dispersion relationship. The main issue that arises here is that the new analytical frequency is expressed as a complex number. This can only be interpreted when placed back into the original harmonic assumption context; and in this case, we will express the sinusoidal time solution for some general variable, ψ, where ωR represents the real component of the complex frequency and ωI represents the imaginary component of the complex frequency.

\psi = e^{-i\omega t} = e^{-i(\omega_R + i\omega_I)t} = e^{-i^2\omega_I t}\, e^{-i\omega_R t} = e^{\omega_I t}\, e^{-i\omega_R t} \qquad (7.54)

What this essentially means is that depending on the sign of the imaginary component of the analytical frequency, the amplitude will either amplify or decay exponentially as time

increases. It has been found experimentally that the sign of ωI is always opposite that of ωR, at least for the experiments performed in the present study. Therefore, for upward propagating gravity waves, the solution decays with time; and for downward propagating gravity waves, the solution amplifies with time. This result qualitatively agrees with the GCM-DC results of SSP05, in which orographically forced gravity waves propagate diagonally upward and decay in oscillation amplitude as they ascend.
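As a quick numerical check of (7.53) and the sign behavior just described, the following sketch evaluates the analytical frequency for an illustrative IGW-scale wave, using the parameter values quoted in this chapter:

```python
import numpy as np

g, R, T0 = 9.8, 287.0, 275.87          # isothermal basic state (section 7.4)
q    = g / (2.0 * R * T0)              # q = 1/(2*Hs) = g/(2*R*T0)
fvdg = -np.log(2.0) / 5600.0           # fractional vertical density gradient (7.34)
Lam  = 0.012                           # Brunt-Vaisala frequency (s^-1)

def omega_a(k, m, delta=1.0, Gamma=1.0, sign=+1.0):
    """Analytical dispersion relationship (7.53); complex when Gamma = 1."""
    return sign * k * Lam / np.sqrt(m**2 + q**2 + delta * k**2
                                    + Gamma * fvdg * (q - 1j * m))

# A 10 km horizontal / 5 km vertical wave: the imaginary component is
# tiny and opposite in sign to the real component, as argued in the text.
w = omega_a(k=2.0 * np.pi / 1.0e4, m=2.0 * np.pi / 5.0e3)
print(w.real, w.imag)
```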

This result also helps shed some light on why the AM-2 trapezoidal numerical solutions (which previously rendered perfect amplitude preservation) suddenly began amplifying for

downward propagating modes and decaying for upward propagating modes after the Boussinesq approximation was relaxed. This effect must be intrinsic to the system of equations, which means that the amplification and decay are not a result of the temporal discretization errors but purely a result of properties within the physical system.

7.6 Extracting Intrinsic Amplification from the Numerical Solutions

As shown in the previous section, there exists some intrinsic amplification or decay of the wave solution to the system of equations (7.48 - 7.51) which is not a result of the SISL discretization.

For this reason, if only the effect of the SISL solution is to be analyzed, then this intrinsic amplification must be removed from the solution. It is known from the previous section that the amplification applied to any wave solution in this system of equations due to the complex analytical frequency is:

A = e^{\omega_I t} \qquad (7.55)

where, again, ωI represents the imaginary component of the complex analytical frequency. Therefore, this should be divided out from the numerical solution as a normalization mechanism to remove the natural amplification of the waves. However, as mentioned earlier, ωI may change sign from one root to another, which means that a mode may amplify or decay. So the main issue here is how to know whether a numerical root naturally amplifies or decays so that only SISL error is obtained.

It was found experimentally that the sign of the imaginary component of ωa is always opposite the sign of the real component. This makes sense from a theoretical

examination. Let us consider the positive solution first:

\omega_a = \frac{k\Lambda}{\sqrt{\,m^2 + q^2 + \delta k^2 + \frac{\Gamma}{\bar{\rho}}\frac{\partial\bar{\rho}}{\partial z}(q - im)\,}} \qquad (7.56)

8 2 is positive. The product of FVDG and q is of order 10− m− which is much smaller than 2 2 6 2 k and m which are all of order 10− m− which means that the real value under the square

root will always be positive and will not contribute to the overall imaginary component of ωa. Therefore, the only possible origin of an imaginary component is the already existing imagi- nary component composed of the product of the FVDG and im. Using the identity, i 1 = i, − − since the imaginary number is positive in the denominator, the overall imaginary component will be negative which means that the imaginary component is opposite in sign to the real

component. Therefore, positive frequencies decay naturally, and negative frequencies amplify

naturally, and the magnitude of ωI remains the same in both. With this knowledge, it is clear that if the sign of the numerical frequency is positive, then

ωI ∆t the amplitude solution should be divided by e−| | . Likewise, if the sign of the numerical

ωI ∆t frequency is negative, then the amplitude solution should be divided by e| | . The term, ∆t, is used in place of t because it denotes the time that has passed since the previous time step. A more general solution which is the one implemented in this study can be thought of as:

ω SISL ω ∆ ω I t λSISL = λSISL e SISL | | (7.57) | |normalized | | | |

163 7.7 Obtaining Total Temporal Error Measures

While it is very useful to understand the characteristics of the error separately in terms of amplitude and frequency, it is also quite useful to understand how the two combine to give an overall error measure to see the overall comparison in a linear, harmonic framework. Therefore, as in Lauritzen and Nair (2007), only the real component of the solution is compared between analytical and numerical results. Using only the real component may seem contradictory in that both real and imaginary root components were used in obtaining the amplitude and frequency information. There are two methods of obtaining the amplitude and frequency information in time. First, one could utilize a general complex numerical root finder algorithm to obtain the frequency according to the following equation: ˆ λSISL cos(ωSISL∆t)+ isin(ωSISL∆t)= 0 (7.58) − which renders a complex solution for ωSISL. As mentioned in section 7.5, a complex frequency means that the imaginary component comes out to form an amplification given by (7.55). The second method is the one used in the study using (7.29) and (7.25) to obtain the numerical frequency and amplitude respectively. Both yield equivalent results, but the point that is either method requires both terms to extract the information since the frequency which ensures (7.58) is complex. However, once this information is indeed extracted, we may apply it to a simple cosine argument since the original harmonic assumption only supposed that the solution is ˆ sinusoidal in time. Thus, with the extracted amplitude, λSISL , and the extracted frequency,

ωSISL, the absolute error over an interval can be obtained.
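As a sketch of the first route: (7.58) states that λ̂SISL = cos(ωSISL Δt) − i sin(ωSISL Δt) = e^{−iωSISL Δt}, so instead of a general root finder one may invert with a complex logarithm. The helper below is a minimal, hypothetical implementation of that inversion:

```python
import numpy as np

def extract_amp_freq(lam_hat: complex, dt: float):
    # lam_hat = exp(-1j * omega * dt)  =>  omega = 1j * log(lam_hat) / dt
    omega = 1j * np.log(lam_hat) / dt
    # Re(omega) is the numerical frequency; |lam_hat| = exp(Im(omega) * dt)
    # is the per-step amplification, i.e., the extracted amplitude.
    return abs(lam_hat), omega.real
```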

There are many ways to compute this error; ideally, it would be computed as the mean absolute error (MAE) measure, which changes linearly as the mean error increases. However, there is no analytical solution to the integral of the absolute difference of two harmonic functions. This means that the MAE measure would require numerical integration, which must be performed with monotonic subgrid functions such as the trapezoidal method, which converges only slowly to the true solution. This renders MAE computationally infeasible for this purpose. There is, however, an analytical solution to the integral of the squared difference of two harmonic functions, and therefore the root mean squared error

(RMSE) measure will be used for this purpose in comparing the different SISL schemes. This error measure is given below, where τc represents the characteristic time period over which to integrate the error.

$$\mathrm{RMSE} = \sqrt{\frac{1}{\tau_c}\int_0^{\tau_c}\left[e^{-\frac{\omega_{SISL}}{|\omega_{SISL}|}\,\Im(\omega_a)\,t}\cos\!\big(\Re(\omega_a)\,t\big) - |\lambda_{SISL}|\cos(\omega_{SISL}\,t)\right]^2 dt}. \qquad (7.59)$$

Now, there are inherent difficulties in obtaining an absolute error measure via the difference of harmonic waves, mainly because of phase relationships. As one integrates over longer and longer time periods, the error itself will not converge, because it oscillates with a period defined

by τE = 2π/|ωa − ωSISL|, and this is an undamped oscillation. The smaller the phase error in the numerical solution, the longer the period of the error oscillation will be. For instance, suppose the analytical solution is characterized by

$$\lambda_a = \sin\left(\frac{2\pi}{3}\,t\right), \qquad (7.60)$$

and the numerical solution is characterized by

$$\lambda_n = \sin\left(\frac{2\pi}{3.1}\,t\right) \qquad (7.61)$$

(that is, an analytical period of 3 s and a numerical period of 3.1 s), shown in Fig. 7.1a. Then the error period will be τE = 93 seconds, and the plot of the error over this period is shown in Fig. 7.1b.
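A quick arithmetic check of the quoted error period, using only the two periods stated above:

```python
import numpy as np

omega_a = 2.0 * np.pi / 3.0   # analytical period of 3 s
omega_n = 2.0 * np.pi / 3.1   # numerical period of 3.1 s
tau_E = 2.0 * np.pi / abs(omega_a - omega_n)
print(tau_E)                  # 93.0 seconds, as quoted in the text
```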

Figure 7.1: Plots of the hypothetical analytical and numerical functions given by (7.60) and (7.61), respectively. (a) Idealized numerical (dashed) and analytical (solid) solutions; (b) associated error.

Now, although the error does not converge, the root mean squared error (RMSE) does converge to a value, though this value in no way signifies the performance of a particular method regarding phase. For instance, consider Fig. 7.2, which shows a plot of the RMSE over time. Since the error itself oscillates, the RMSE does as well. However, the RMSE is a damped oscillation, because more and more time is being included in the overall mean; it is visibly evident that the damping behaves as $t^{-1/2}$. The value toward which the RMSE tends as time goes to infinity is exactly the value obtained after one period of error oscillation (or one half period of error oscillation, as it is mirrored about τE/2), which is always unity regardless of the difference in phase. In the previous statement, the amplitude is assumed to be perfect in the numerical solution, but even so it shows that using the limit as time goes to infinity does not accurately convey phase errors, nor does an RMSE computed near or beyond τE/2. Therefore, the RMSE must be computed over a time period much less than τE/2 if both the phase and amplitude aspects of the error are to contribute to the overall integrated mean error.
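This behavior is easy to reproduce numerically. The sketch below approximates the RMSE between (7.60) and (7.61) by simple discrete averaging (not the analytical integral used in the study) and shows it oscillating while damping toward unity as the integration window grows:

```python
import numpy as np

def rmse_over(tau_c: float, n: int = 200_000) -> float:
    # Discrete approximation of the RMSE of the difference between
    # (7.60) and (7.61) over the interval [0, tau_c].
    t = np.linspace(0.0, tau_c, n)
    diff = np.sin(2.0 * np.pi * t / 3.0) - np.sin(2.0 * np.pi * t / 3.1)
    return float(np.sqrt(np.mean(diff ** 2)))

tau_E = 93.0
for windows in (0.1, 0.5, 1.0, 5.0, 50.0):
    # Windows well below tau_E / 2 still reflect phase error; longer
    # windows damp toward the phase-blind limit of unity.
    print(windows, rmse_over(windows * tau_E))
```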

It turns out that for methods which preserve phase very well, τE can be on the order of 10⁹ seconds, which clearly would not be a problem if one were to integrate over about a period of a day. However, for poor phase-preserving methods such as the AM-1 hydrostatic scheme, τE can be on the order of 10³ seconds, which is less than one day. Therefore, to obtain robust comparative total error measures, only one period of analytical wave oscillation will be used for the RMSE integration, ensuring that the RMSE is computed over a period much less than τE/2 for all of the AM methods. For NH (H) simulations with k = m = 2π/5000 m⁻¹, τc = 741 s (τc = 525 s).

Additionally, when more than two non-zero terms appear in the forcings of the dynamical system (for example, in the AM-3 scheme), two computational modes and two physical modes manifest, each set with one mode propagating upward and one mode propagating downward. If the total error is to truly be obtained, we must know the effect of both of these modes on the total solution. Since we have already assumed the data and the results are

Figure 7.2: RMSE between (7.60) and (7.61) as a function of time. The x-axis denotes the number of error periods, τE, over which the RMSE is integrated.

harmonic, it is sufficient to simply add together the computational and physical waves which propagate in the same direction. It turns out that the computational frequencies tend to infinity as Δt → 0 and the computational amplitudes tend to zero as Δt → 0 (as stated in SSP05). This means that for sufficiently small time steps, the computational mode simply adds high-frequency, low-amplitude noise to the solution and affects the error by a relatively small amount. However, as the time step increases, this computational mode comes much closer to the natural amplitude and frequency of the solution and interferes more with the accuracy. Therefore, with λc and ωc representing the computational amplitude and frequency numerical solutions, and λp and ωp the corresponding physical ones, the total combined error can be expressed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{\tau_c}\int_0^{\tau_c}\left[e^{-\frac{\omega_{SISL}}{|\omega_{SISL}|}\,\Im(\omega_a)\,t}\cos\!\big(\Re(\omega_a)\,t\big) - \big[\lambda_c\cos(\omega_c t) + \lambda_p\cos(\omega_p t)\big]\right]^2 dt} \qquad (7.62)$$
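The bracketed sum in (7.62) simply superposes the two cosine modes before the difference is squared; a minimal sketch, with all amplitudes and frequencies as placeholders, is:

```python
import numpy as np

def combined_signal(t, lam_p, omega_p, lam_c, omega_c):
    # Physical and computational modes propagating in the same direction
    # are summed to form the total numerical solution, as in (7.62).
    return lam_p * np.cos(omega_p * t) + lam_c * np.cos(omega_c * t)
```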

7.8 Uncentering Methods

Here, formally first- and second-order uncentering methods will be presented as given in Nair and Tufo (2007). It was reviewed earlier that there exists a spurious resonance between SISL time integration schemes and certain IGW length and time scales, which is particularly known to be forced by orography. The mathematical condition for this resonance was formulated in Nair and Tufo (2007), and a method of keeping this condition from occurring in the general case was obtained by off-centering the coefficients of the forcing terms. The first-order-in-time formulation is given by

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \frac{1+\varepsilon_f}{2}\,F^{N+1} + \frac{1-\varepsilon_f}{2}\,F^N \qquad (7.63)$$

or by the previous formulation in (7.10) with α1 = (1 + εf)/2, α2 = (1 − εf)/2, and α3 = 0, where εf is the first-order uncentering parameter, usually between zero and one half. Analyzing this form, it is immediately apparent that the time discretization with εf = 0 is fully second-order in time, as in the AM-2 or trapezoidal method.

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \left(\frac{1}{2}+\frac{\varepsilon_f}{2}\right)F^{N+1} + \left(\frac{1}{2}-\frac{\varepsilon_f}{2}\right)F^N \qquad (7.64)$$

In fact, as shown by this form, it is clear that the first-order uncentered method is simply a perturbation off the second-order AM scheme. However, with this perturbation in coefficients, the convergence formally degrades to first order. It should be noted that the effects of this uncentering are extremely sensitive to the uncentering parameter. In Bates et al. (1993), it is suggested that εf = 0.07 be used; however, in Nair and Tufo (2007), it is suggested that εf = 0.4 be used, and there is a very large difference in the characteristics of these two parameters. With εf = 0.07, the system behaves almost identically to second order. With εf = 0.4, it behaves as one might expect: as an average of the first- and second-order AM methods. Therefore, in this study, when uncentering is considered, both of these first-order parameters will be tested.

The temporally second-order formulation is given by

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \frac{1+\varepsilon_s}{2}\,F^{N+1} + \frac{1-2\varepsilon_s}{2}\,F^N + \frac{\varepsilon_s}{2}\,F^{N-1} \qquad (7.65)$$

or by equation (7.10) with α1 = (1 + εs)/2, α2 = (1 − 2εs)/2, and α3 = εs/2, where εs is the second-order uncentering parameter. Again, it is very interesting to see that, as before, this relationship can be expressed as a perturbation off the AM-3 method, as shown below, where εs = ε̂s − 1/6.

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \left(\frac{1}{2}+\frac{\varepsilon_s}{2}\right)F^{N+1} + \left(\frac{1}{2}-\varepsilon_s\right)F^N + \frac{\varepsilon_s}{2}\,F^{N-1} \qquad (7.66)$$

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \left(\frac{1}{2}+\frac{\hat{\varepsilon}_s-\frac{1}{6}}{2}\right)F^{N+1} + \left(\frac{1}{2}-\hat{\varepsilon}_s+\frac{1}{6}\right)F^N + \left(\frac{\hat{\varepsilon}_s-\frac{1}{6}}{2}\right)F^{N-1} \qquad (7.67)$$

$$\frac{\psi^{N+1} - \psi^N}{\Delta t} = \left(\frac{5}{12}+\frac{\hat{\varepsilon}_s}{2}\right)F^{N+1} + \left(\frac{2}{3}-\hat{\varepsilon}_s\right)F^N + \left(-\frac{1}{12}+\frac{\hat{\varepsilon}_s}{2}\right)F^{N-1} \qquad (7.68)$$

What this basically shows is that by setting εs = −1/6, the AM-3 method is obtained. Therefore, in theory, second-order uncentering provides a mixture of second- and third-order characteristics, but the perturbation off the third-order formulation renders the method formally only second-order. Only one second-order uncentering parameter seems to come up operationally in the literature, and that value is εs = 0.5 (ε̂s = 2/3); this will therefore be the value used in the present study when uncentering is considered.

Additionally, it is interesting to consider the possible implications of this off-centered perturbation of the third-order method which renders the Nair and Tufo (2007) second-order uncentering scheme. If a value of εs = −1/6, which is a non-zero parameter and should theoretically help damp spurious oscillations, renders the formally third-order AM method, does the AM-3 method even need any trajectory uncentering to control spurious oscillations? This would be a very interesting topic to study operationally in the future, but it may be that a stable, resonance-free, third-order solution can be obtained with only three forcing terms in the AM-3 method. Again, this would, of course, require future study and is beyond the scope of the present thesis.
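The coefficient relationships in this section are easy to verify symbolically. The sketch below (with hypothetical helper names) evaluates the α coefficients of (7.63) and (7.65) and confirms that εs = −1/6 reproduces the AM-3 coefficients:

```python
from fractions import Fraction

def first_order_alphas(eps_f):
    # (7.63): alpha1, alpha2, alpha3 of the general form (7.10)
    return ((1 + eps_f) / 2, (1 - eps_f) / 2, 0)

def second_order_alphas(eps_s):
    # (7.65)
    return ((1 + eps_s) / 2, (1 - 2 * eps_s) / 2, eps_s / 2)

print(second_order_alphas(Fraction(-1, 6)))
# -> (Fraction(5, 12), Fraction(2, 3), Fraction(-1, 12)), i.e., AM-3
```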

Chapter 8

Results

Here, the methods of the previous chapter will be employed to analyze the temporal error characteristics of the Adams-Moulton SISL solutions applied to a linearized harmonic model of Internal Gravity Waves (IGWs). It should be noted beforehand that certain terms such as “stable” and “unstable” will be used regarding the amplitude characteristics. All amplitudes are assumed to begin at unity and are altered from there during the SISL solve. The term “unstable” in this context, therefore, refers to a resulting numerical amplitude greater than unity, which constitutes an exponential increase in amplitude as more and more time steps are taken. Similarly, “stable” denotes a numerical amplitude less than unity, for which the solution would decay with time toward zero. It should also be noted that a temporal numerical instability in no way means the implemented method would be unstable, because interpolation (as mentioned in the introduction, an integral part of SL methods) is known to always be diffusive in nature, which would act in tandem with the temporal characteristics to provide overall stability of the method. Additionally, no true data is completely harmonic in nature and would thus behave somewhat differently. However, for IGW motion, the harmonic assumption provides fairly realistic and valuable insight into the characteristics of the SISL method.

8.1 AM-1 Results: Implicit Euler Method

Here, the coefficients αi are assumed to be α1 = 1, α2 = 0, and α3 = 0, as applied to (7.17). Because α3 = 0, µ0 = µ1 = 0, and the polynomial is only quadratic in nature since two roots are now guaranteed to be zero. At least two roots are required for a solution containing gravity waves, because they propagate hyperbolically with one frequency positive and one frequency negative. Therefore, clearly both extracted roots are physical modes, and there are no computational modes associated with this method. This particular equation is very easy to solve analytically using the quadratic formula, giving the two roots of interest in complex space:

$$\hat{\lambda} = \frac{-\mu_3 \pm \sqrt{\mu_3^2 - 4\mu_4\mu_2}}{2\mu_4}. \qquad (8.1)$$
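Assuming the reduced polynomial takes the form µ4 λ̂² + µ3 λ̂ + µ2 = 0 (consistent with (8.1)), a minimal sketch of the root computation is:

```python
import numpy as np

def amplification_roots(mu2: complex, mu3: complex, mu4: complex):
    # Quadratic-formula roots of mu4 * lam^2 + mu3 * lam + mu2 = 0;
    # casting to complex keeps the square root valid for any discriminant.
    disc = np.sqrt(complex(mu3 ** 2 - 4.0 * mu4 * mu2))
    return (-mu3 + disc) / (2.0 * mu4), (-mu3 - disc) / (2.0 * mu4)
```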

Fig. 8.1 shows a plot of the amplitude after one time step for hydrostatic (H) and non-hydrostatic (NH) solutions with and without the Boussinesq approximation, to show the effect of these assumptions on first-order accuracy in amplitude. In this experiment, horizontal and vertical wavelengths of Lx = 5 km and Lz = 5 km, respectively, are used to remain consistent with SSP05. Clearly, as also shown in SSP05, the NH solutions suffer less damping than the H solutions. Additionally, and this is a new result, the removal of the Boussinesq approximation in the continuity equation yields two different solutions for amplitude depending on which direction the gravity wave propagates (a property manifested by the sign of the frequency of the particular root). The upward propagating wave preserves amplitude slightly better than the downward propagating wave for both H and NH solutions. Another way of stating this, which will link it to the higher-order approximations, is that the upward propagating wave is more unstable than the downward propagating wave. Still, clearly, for the AM-1 scheme, all temporal amplitudes are stable.

Fig. 8.2 plots the magnitudes of the positive and negative numerical frequencies for the AM-1

Figure 8.1: Solution amplitudes for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

method over a large range of time steps with the same length scales of Lx = Lz = 5 km. This helps judge the performance of the AM-1 scheme in the other property of harmonic output, which is frequency. There are analytical values for the frequency as given by the calculated dispersion relationship in (7.56); only the real part of this dispersion relationship is desired, because the imaginary component gives amplitude information and not frequency information. The relative difference in analytical frequency with and without the Boussinesq approximation is only O(10⁻⁴) for NH motions and O(10⁻³) for H motions, which is quite small relative to numerical errors. Therefore, when judging the relative errors from the graph in Fig. 8.2, keep in mind that the numerical difference in frequency with and without the Boussinesq approximation is much larger than the analytical difference, which is also evident as the curves converge graphically to almost the same values. Therefore, the plot gives an accurate indication of overall frequency accuracy. Also, when viewing this plot, as with the other frequency plots, be aware that a negative mode appearing on the positive axis has been reflected about ω = 0 (multiplied by −1) for comparison with the positive modes; though they are mirrors analytically, they are not mirrored numerically. Clearly, again, the NH simulations contain much less error than do the H simulations regarding frequency preservation. What is interesting to note is that though the positive amplitudes are more accurate in the AM-1 method, the frequency preservation is worse. There is an inverse relationship manifesting here between amplitude and frequency, where for one direction the SISL method enhances amplitude and degrades frequency relative to the Boussinesq solution, and the other direction experiences the opposite effect.

Fig. 8.3 gives a plot of the H and NH relative numerical SISL errors with and without the

Boussinesq approximation and for positive and negative propagating modes of IGWs. Clearly, the NH regime is more accurate than the H regime. Additionally, the SISL method seems to resolve upward IGW motion slightly more accurately than downward IGW motion. This seems to suggest that the amplitude is playing more of a role in the RMSE measure than the frequency, since the positive mode is better in amplitude and worse in frequency. However, as will be seen, these values are improved upon greatly by increasing the order of convergence.

Figure 8.2: Solution frequencies for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Negative modes have been reflected about ω = 0 for direct visual comparison with positive modes.

Figure 8.3: Numerical errors for AM-1 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

The temporal order of convergence, as manifested by the ratio of the NH

Boussinesq solution at 60 seconds and 120 seconds is about 1.38 as given by the relation:

$$\frac{\mathrm{RMSE}(2\Delta t)}{\mathrm{RMSE}(\Delta t)} = 2^N \;\Longrightarrow\; N = \log_2\!\left(\frac{\mathrm{RMSE}(2\Delta t)}{\mathrm{RMSE}(\Delta t)}\right) \qquad (8.2)$$

Clearly, however, judging from the curvature and inflection of the error as a function of time step, the RMSE used in this study gives greater than first-order convergence for smaller time steps and less than first-order convergence for larger time steps, with the inflection (the point of first-order convergence) near Δt = 60 seconds.
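For reference, the observed order in (8.2) can be computed directly; the RMSE values below are placeholders chosen only to reproduce the quoted 1.38:

```python
import numpy as np

def observed_order(rmse_dt: float, rmse_2dt: float) -> float:
    # N = log2( RMSE(2*dt) / RMSE(dt) ), per (8.2)
    return float(np.log2(rmse_2dt / rmse_dt))

print(observed_order(0.10, 0.10 * 2 ** 1.38))   # -> 1.38
```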

8.2 AM-2 Results: Trapezoidal Method

Here, the coefficients αi are assumed to be α1 = 1/2, α2 = 1/2, and α3 = 0. This method is more commonly known as the trapezoidal integration method. Again, because α3 = 0, µ0 = µ1 = 0, which once again means the polynomial is only quadratic in nature since two roots are guaranteed to be zero. These roots can be calculated trivially using (8.1). Fig. 8.4 gives the amplitudes of the two roots for the AM-2 SISL scheme for H and NH simulations with and without the Boussinesq approximation. As is true with the AM-1 scheme, the positive (upward propagating) IGW modes are more unstable than the negative

(downward propagating) ones. In this case, whereas the original Boussinesq SISL solution rendered perfect amplitude (a common property of second-order methods), the non-Boussinesq SISL solution is formally unstable. Clearly, both H and NH amplitudes are much closer to unity than for the AM-1 method. Also, the relative increase in accuracy from H to NH solutions is larger than in the AM-1 case: here, the NH solution is five times more accurate than the H solution, whereas in the AM-1 case the NH solution was only about 1.5 times more accurate. Next, Fig. 8.5 plots the AM-2 SISL frequency solutions as a function of increasing time step. What is very interesting to note is that both the Boussinesq and non-Boussinesq upward and downward propagating solutions have indeed been plotted; however, there is almost no

deviation in the frequency imposed by relaxing the Boussinesq approximation, as there was in the AM-1 method. As will be evidenced in the following section, where the AM-3 frequencies once again separate the accuracy of the negative and positive frequency modes, perhaps this is a property common to even-ordered schemes and absent from odd-ordered schemes. Clearly, both H and NH solutions remain more accurate than in the AM-1 case as the time step increases.

Finally, the total error including both amplitude and frequency effects is plotted in Fig. 8.6 as a function of increasing time step. What seems most immediately clear is that there is a much greater advantage in going to a NH regime at larger time steps than for the AM-1 method.

Figure 8.4: Solution amplitudes for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.5: Solution frequencies for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Only positive modes are shown.

Also, these errors are an order of magnitude smaller than the AM-1 errors, which shows clearly that there is a definite advantage to increasing the order of temporal accuracy. Again, this is an idealized situation, and nothing has yet been said about the effects of uncentering for the AM-2 scheme. Still, the pure AM-2 scheme shows significant advantages over the AM-1 scheme in all aspects of accuracy in simulating IGW motion in the atmosphere. The temporal order of convergence, as manifested by the NH Boussinesq error at 60 and 120 seconds, is about 1.90, and this (unlike for the AM-1 scheme) is generally constant over time step. This is likely because with the AM-1 method, frequency was poorly calculated numerically, and thus the error oscillation period was much smaller (see section 7.7). The AM-2 method, resolving frequency more accurately, likely has an RMSE integration period relatively much smaller than the error oscillation period, which gives a somewhat more meaningful and more accurate result.

8.3 AM-3 Results

Here, the coefficients αi are assumed to be α1 = 5/12, α2 = 2/3, and α3 = −1/12, which results in full quartic behavior of the system, meaning there will exist two physical and two computational modes, with both sets of modes having one propagating upward (positive frequency) and one propagating downward (negative frequency). This quartic equation can be solved either analytically or numerically; in this case, Maple software has been employed to calculate the roots numerically with 15 decimal digits of accuracy. Fig. 8.7 shows the physical and computational mode amplitudes for the AM-3 simulations as a function of time step. This is the first method presented so far which actually renders non-zero computational modes, and as mentioned in the methods chapter, these two harmonic solution modes may simply be summed to gain a total solution (assuming the directions of propagation are matched in the sum). Clearly, the magnitude of the computational amplitude remains small for smaller time steps, with an asymptote of zero as Δt → 0.

Figure 8.6: Relative numerical errors for AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Both H and NH solutions of the physical mode are formally unstable for this model. What is very interesting to note is that the physical NH mode keeps its accuracy high for much larger time steps than its H counterpart, whose error increases much more steeply at a time step of around 60 seconds. Additionally, the non-

Boussinesq amplitude split is smaller for NH regimes. What is interesting to note is that for NH solutions, the physical mode is more accurate and the computational mode is smaller in amplitude, which, when combined, will likely mean that the NH solution presents an even larger gain in accuracy than seen before in the other methods. This tendency seems to suggest that the higher the order of convergence of a method, the more accuracy is gained by performing NH simulations.

Next, Fig. 8.8 shows a plot of the AM-3 solution frequencies for physical and computational modes as a function of increasing time step. The physical NH AM-3 frequency is a very impressive result in comparison to the AM-2 and AM-1 methods. Additionally, it becomes clear that the AM-3 computational modes are basically high-frequency, low-amplitude noise, which is overall a beneficial characteristic since such solutions will not interfere much with accuracy at reasonable time steps relative to the magnitude of the physical error. The NH computational mode is of slightly higher frequency than the H computational mode, which is theoretically an improvement since the NH analytical frequency is smaller in magnitude than the H analytical frequency; a higher-frequency computational mode implies that it simply oscillates about the physical mode with a smaller wavelength.

In Fig. 8.9 the RMSE for the AM-3 scheme is plotted as a function of increasing time step for the physical solution and the combined solution (summing the physical and computational harmonic modes). There is a strange behavior in the non-Boussinesq modes relative to the AM-1 and AM-2 methods for the physical mode error in that, for small time steps, the non-Boussinesq solutions are both significantly less accurate than the Boussinesq solution. Yet, at

Figure 8.7: Solution amplitudes for AM-3 simulations: (a) physical modes; (b) computational modes. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.8: Solution frequencies for AM-3 simulations: (a) physical modes; (b) computational modes. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Negative modes have been reflected about ω = 0.

larger time steps, the Boussinesq error as before represents the middle ground of the positive and negative non-Boussinesq mode errors. This phenomenon shows up to a much smaller extent in the AM-2 simulation. Analyzing the errors more closely, it turns out that the negative mode in the AM-3 simulation has worse frequency preservation than the Boussinesq value but better amplitude preservation. Also, the positive non-Boussinesq mode has similar frequency preservation to the Boussinesq solution but worse amplitude preservation. For both effects, the aspect that is degraded relative to the Boussinesq solution is larger in magnitude than the aspect which improves upon it. Thus, the overall error reflects degradation over smaller time steps for the non-Boussinesq modes relative to the Boussinesq solution.

Fig. 8.9b shows the combined solution error with physical and computational modes summed. It becomes more clear how the computational mode affects the accuracy. At small time steps, it only represents high-frequency oscillations about the physical mode. However, at larger time steps, the frequency magnitude decreases exponentially toward the analytical (and thus the physical) value and quickly joins the physical mode in causing a greater magnitude of degradation in error. Clearly, this happens at a greater time step for the NH solution relative to the H solution, which suggests an added benefit of simulating NH dynamics. The combined solution error as a function of time step exhibits a very interesting behavior in that there seem to be two distinct regimes of convergence, both of them linear in nature. For the NH solution, from zero to about 90 seconds, there seems to be a linear convergence (relatively constant slope) with a very shallow slope, and after 90 seconds, that slope becomes much larger. This is true for the H solution as well, except that the switch in “regime” occurs at a smaller time step and both slopes are greater, meaning the error grows more quickly as the time step increases. This may suggest that although the physical mode convergence is third-order in nature, when the computational mode is included the nature of convergence is perhaps reduced to linear, with a much smaller error and slope. The switch in regime could best be characterized by the point at which the computational mode frequency becomes of the same order of magnitude as

the physical mode frequency and no longer represents a perturbation about the physical mode error but a significant addition to it. The convergence of the physical Boussinesq AM-3 scheme, judging from the 60 s and 120 s errors, is actually about 4.03, which is one order higher than expected, since this method is generally thought to be a third-order scheme. The 60–120 s convergence of the negative non-Boussinesq mode, however, is only about 2.56, which shows that in a real context the convergence is not always fourth order. Regardless, in comparison to the AM-1 and AM-2 schemes, the actual magnitude of error is much less, which represents a significant potential gain in accuracy if the AM-3 method were to be used in place of traditional second-order uncentered methods.

8.4 Uncentering Results

8.4.1 First-Order Results

Next, it is desirable to quantify the effects of uncentering on amplitude, frequency, and combined RMSE for the AM methods. As mentioned in section 7.8, first-order uncentering is given by (7.63) and has been referenced in the literature with the uncentering coefficients εf = 0.07 and εf = 0.4. Fig. 8.10 shows the amplitude as a function of increasing time step for AM-1 first-order uncentering with both of the aforementioned uncentering parameters. Particularly in the εf = 0.07 hydrostatic non-Boussinesq positive solution, it is interesting to note the obvious inflection point in the amplitude as the time step increases. This inflection is present in both εf = 0.07 non-Boussinesq positive solutions but is particularly evident in the hydrostatic solutions. This manifests the fact that such a small deviation from second-order accuracy still renders second-order characteristics in the solution: second-order positive mode amplitudes increase unstably as the time step increases (see Fig. 8.4), and this shows in Fig. 8.10a as well, especially at larger time steps.

Figure 8.9: Relative numerical errors for AM-3 simulations: (a) physical mode error; (b) combined physical and computational mode error. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Additionally, the amplitude is far more accurate than in the original first-order solution (see Fig. 8.1). There seems to be much greater relative deviation between the non-Boussinesq positive and negative solutions for εf = 0.07 than for εf = 0.4 or the original first-order case. Lastly, the εf = 0.4 solution exhibits poorer accuracy than εf = 0.07 but better accuracy than the original first-order method regarding amplitude. This is expected, as it is, in a way, nearly a linear average of the first- and second-order methods.

Next, analyzing the frequency solutions, it becomes very clear how similar εf = 0.07 uncentering is to the second-order method. Fig. 8.5 shows that there is extremely little relative deviation in frequency between the positive and negative non-Boussinesq solutions as the time step increases, and Fig. 8.11a clearly shows the same. Again, the frequency accuracy is slightly worse than second order for εf = 0.07 and much better than the original first-order solution; and again, the εf = 0.4 solution represents a middle ground between first- and second-order accuracy and characteristics, as the positive/negative non-Boussinesq split is slightly less severe and the accuracy is nearly the average of the first- and second-order AM methods.

Looking at the overall accuracies of the schemes in the RMSE measure in Fig. 8.12, it is clear that if εf = 0.07 sufficiently damps the spurious resonance of IGW motion forced by orography in a model, it is greatly preferable to εf = 0.4, especially in the NH regime of motion. The errors only convey what has already been said, which is that the first-order uncentering methods give a mixture of first- and second-order characteristics and also give errors in between those seen in the purely first- and second-order AM methods. The 60–120 s order of convergence for εf = 0.07 is 1.87, in comparison to the 1.90 order of convergence of the AM-2 scheme, which demonstrates that the method really is essentially second-order in time as far as realistic numerics are concerned.

Figure 8.10: Numerical amplitudes for uncentered AM-1 simulations: (a) εf = 0.07; (b) εf = 0.4. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.11: Numerical frequencies for uncentered AM-1 simulations: (a) εf = 0.07; (b) εf = 0.4. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.12: Numerical errors for uncentered AM-1 simulations: (a) εf = 0.07; (b) εf = 0.4. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

8.4.2 Second-Order Results

Next, the effects of uncentering on second-order solutions will be analyzed graphically. In the literature, only one second-order uncentering coefficient is typically mentioned: εs = 0.5. The equation for second-order uncentering is given by (7.65). Fig. 8.13 shows a plot of second-order uncentered amplitude versus time step. The amplitude preservation is clearly worse than for the pure AM-2 method, but it is still significantly better than first order. There is also less difference between H and NH amplitudes than in the AM-2 case.

Looking at Fig. 8.14, one can see that the real story is the degradation in frequency preservation accuracy when uncentering is applied in a second-order context. These frequency errors are nearly those of the AM-1 method, which basically means that the phase is more likely to dominate the overall error for second-order uncentering. The method retains the second-order behavior of a very small non-Boussinesq negative/positive mode splitting in frequency preservation, but other than that, it looks little like the AM-2 method.

Fig. 8.15 shows the RMSE measures for the second-order uncentered physical mode and combined solutions. It is particularly the H solution which suffers the greatest accuracy degradation under second-order uncentering; the NH errors are nearly the same as with the AM-1 method. It is obvious that second-order uncentering implicitly damps gravity wave modes to a great extent, which is a positive result when waves of the scale of Rossby waves are of greatest interest. However, it is an undesirable result when smaller-scale features are of interest, as these are well known to have a large impact on the mean flow, particularly through the phenomenon of IGW breaking. Looking at the combined solution, it is also interesting to note that though both H and NH solutions became worse, the H solution error seems to meet a temporary cap at around Δt = 100 s and then improves. It is nevertheless evident that the computational mode adds higher-frequency noise to the solution at small time steps and really begins to impinge on accuracy at larger time steps, as the computational mode frequency approaches the magnitude of the physical frequency.

Figure 8.13: Numerical amplitudes for uncentered AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.14: Numerical frequencies for uncentered AM-2 simulations. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.


8.4.3 Intercomparison

In order to wrap up the analysis and gain an overall perspective on the accuracies of the differing methods, Fig. 8.16 provides a composite of the NH Boussinesq solutions of the AM-1, AM-2, AM-3, and first- and second-order uncentered methods. The Boussinesq solutions are plotted because their positive and negative frequency modes behave identically, and it has been generally shown that the non-Boussinesq error magnitudes are similar. The relative performances of the AM-1, AM-2, and AM-3 schemes (at least regarding the physical modes) are not particularly surprising, as one would naturally expect them to become progressively more accurate. What is a positive result is that the combined AM-3 solution is still the most accurate for larger time steps (greater than 40 seconds). It is interesting that the AM-3 combined solution has greater error for smaller time steps, and this can be attributed to the influence of the computational mode. Even though the computational mode has measurable amplitude at small time steps, it increases sub-linearly with increasing time step, which allows the large-time-step solution to remain the most accurate.

What is odd is that the second-order uncentered scheme performs so poorly in this RMSE measure. The physical mode alone is roughly the same as the AM-1 scheme, and the combined scheme is actually worse. It should be noted, however, that the second-order uncentered scheme with εs = 0.5 is very efficient, because the forcing coefficient α2 is zero, which means that forcing term needs no computation. Additionally, this result is not comparable to results for larger-scale phenomena such as Rossby waves, since the dynamics as well as the scale of motion are very different. It has been demonstrated in Nair and Tufo (2007) and Cote et al. (1998) that the second-order uncentered scheme indeed outperforms even the first-order εf = 0.07 scheme for Rossby wave motion. In the present study, it is shown that the second-order uncentered scheme does not perform as well for IGW motion.

Figure 8.15: Numerical errors for uncentered AM-2 simulations: (a) physical mode; (b) combined physical and computational modes. H (NH) stands for a hydrostatic (non-hydrostatic) solution. Bouss (No Bouss) denotes a solution with (without) the Boussinesq approximation. Positive denotes a gravity wave solution propagating upward, and negative denotes a gravity wave solution propagating downward.

Figure 8.16: Relative numerical error comparison for AM-1, AM-2, AM-3, and first- and second-order uncentered NH Boussinesq simulations.

The εf = 0.07 first-order uncentered scheme performed nearly identically to the AM-2 scheme, which is expected since

they differ in forcing coefficients by a mere 0.035. The εf = 0.4 first-order uncentered scheme edged more toward the AM-2 scheme in terms of accuracy than the AM-1 scheme; even so, its RMSE is much greater than that of the AM-3 combined scheme.

Chapter 9

Conclusions and Future Work

In this part of the present thesis, a linear SISL 2-D model has been developed with harmonic input following Semazzi et al. (2005) and Gill (1980). Additionally, the Boussinesq approximation was removed by prescribing an analytical function of density with height, namely an exponential function. The fractional vertical density gradient was therefore a constant, rendering a solution which is independent of height. Solutions were derived by prescribing first-, second-, and third-order Adams-Moulton schemes for the temporal discretization and centered second-order derivative discretizations. Spatial error was then removed from the system by calculating the limit of the terms associated with spatial error as the grid spacing tends to zero. Then, analytical solutions were derived by differentiating the harmonic input and dropping common terms. For both the numerical and analytical models, the solutions were obtained by setting up a homogeneous system in terms of the dependent variables and setting the determinant of the matrix to zero, rendering a general quartic polynomial, the roots of which describe the solutions.

It was found in the analytical non-Boussinesq derivation that the system of IGW motion naturally amplifies (decays) in time for downward (upward) propagating modes. Therefore, this natural behavior was extracted from the system by simple division to obtain purely that

which is modified by the numerical SISL treatment in time. It was found that the SISL schemes all treated non-Boussinesq positive and negative modes differently, and this difference in behavior varied from method to method. The AM-1, AM-2, and AM-3 methods progressively became more accurate in the overall RMSE; however, the second-order scheme had the best amplitude preservation. It was found that for the AM-2 non-Boussinesq solution, upward IGW modes were unstable and downward IGW modes were stable regarding the temporal numerical modification after one time step. Additionally, the AM-3 scheme was unconditionally slightly unstable, which indicates that if the spatial truncation due to interpolation does not dampen the solution enough, some smoothing (or perhaps uncentering) may be needed.

The methods of uncentering were also analyzed as presented in Nair and Tufo (2007). The first-order uncentering method is based on a perturbation of the AM-2 scheme such that the future time step forcing term gains more weighting and the current time step forcing term loses some weighting (speaking from a backward-trajectory methodology). Also, if one were to set the second-order uncentering coefficient to εs = −1/6, the AM-3 scheme is identically obtained, which suggests that this non-zero εs may in fact be non-resonant regarding the orographic resonance problem treated by uncentering. Nair and Tufo (2007) only mentions that

εs ≥ 0 ensures stable modes for all four gravity wave components in time; no specific mention is made of negative uncentering parameter values, and no known implementations have tested this. Therefore, it would be beneficial to study this in the future to see whether a full third-order, resonance-free SISL solution can be obtained with εs = −1/6 (the AM-3 scheme).

The AM-3 combined scheme (combining physical and computational harmonic modes in

the solution) turns out to be more accurate than any of the other methods in this study for larger time steps (greater than 40 seconds). Whether the AM-3 scheme would need uncentering is unknown and requires future research, as such uncentering would likely require more than three forcing terms in the SISL discretization. How to interpret such results would also be a

200 topic of future research as more than four modes would be obtained and could no longer be separated into two physical and two computational modes of solution.

Bibliography

Artebrant, R. and J. H. Schroll, 2005: Conservative logarithmic reconstructions and finite volume methods. SIAM Journal on Scientific Computing, 27 (1), 294–314.

Artebrant, R. and J. H. Schroll, 2006: Limiter-free third order logarithmic reconstruction. SIAM Journal on Scientific Computing, 28 (1), 359–381.

Bates, J. R., S. Moorthi, and R. W. Higgins, 1993: A global multilevel atmospheric model using a vector semi-lagrangian finite difference scheme. Part I: Adiabatic formulation. Monthly Weather Review, 121, 244–263.

Bates, J. R., F. H. M. Semazzi, R. W. Higgins, and S. R. M. Barros, 1990: Integration of the shallow water equations on the sphere using a vector semi-lagrangian scheme with a multigrid solver. Monthly Weather Review, 118 (8), 1615–1627.

Benoit, R., M. Desgagné, P. Pellerin, S. Pellerin, Y. Chartier, and S. Desjardins, 1997: The canadian mc2: A semi-lagrangian, semi-implicit wideband atmospheric model suited for finescale process studies and simulation. Monthly Weather Review, 125 (10), 2382–2415.

Carpenter, R. L., K. K. Droegemeier, P. R. Woodward, and C. E. Hane, 1990: Application of the

piecewise parabolic method (ppm) to meteorological modeling. Monthly Weather Review, 118, 586–612.

Chlond, A., 1994: Locally modified version of Bott's advection scheme. Monthly Weather Review, 122 (1), 111–125.

Colella, P. and P. R. Woodward, 1984: The piecewise parabolic method (ppm) for gas-dynamical simulations. Journal of Computational Physics, 54, 174–201.

Cote, J., A. Methot, A. Patoine, M. Roch, and A. Staniforth, 1998: The operational cmc/mrb global environmental multiscale (gem) model. Monthly Weather Review, 126, 1373–1395.

Courant, R., K. Friedrichs, and H. Lewy, 1928: Über die partiellen Differenzengleichungen der mathematischen Physik. Mathematische Annalen, 100 (1), 32–74.

Eliassen, A. and E. Raustein, 1968: A numerical integration experiment with a model atmo- sphere based on isentropic coordinates. Meteorologiske Annaler, 5, 45–63.

Fitzgerald, W., D. Lemire, and M. Brooks, 2005: Quasi-monotonic segmentation of state variable behaviour for reactive control. Proceedings of the Twentieth National Conference on Artificial Intelligence, National Research Council of Canada.

Fritsch, F. N. and R. E. Carlson, 1980: Monotone piecewise cubic interpolation. SIAM Journal on Numerical Analysis, 17 (2), 238–246.

Gates, W. L., E. S. Battern, A. B. Kahle, and A. B. Nelson, 1971: A documentation of the mintz-arakawa two-level atmospheric general circulation model. Tech. rep.

Gill, A. E., 1980: Some simple solutions for heat-induced tropical circulation. Quarterly Jour- nal of the Royal Meteorological Society, 106, 447–462.

Goldstein, S., 1931: On the stability of superposed streams of fluids of different densities. Proc. R. Soc. London A, Vol. 132, 524–548.

Gordon, C. T. and W. F. Stern, 1982: A description of the gfdl global spectral model. Monthly Weather Review, 110 (7), 625–644.

Gravel, S. and A. Staniforth, 1994: A mass-conserving semi-lagrangian scheme for shallow-water equations. Monthly Weather Review, 122, 243–248.

Harten, A., 1997: High resolution schemes for hyperbolic conservation laws. J. Comput. Phys., 135 (2), 260–278.

Harten, A., B. Engquist, S. Osher, and S. R. Chakravarthy, 1987: Uniformly high order accurate essentially non-oscillatory schemes, III. J. Comput. Phys., 71 (2), 231–303.

Holloway, J. L. and S. Manabe, 1971: Simulation of climate by a global general circulation model. Monthly Weather Review, 99 (5), 335–370.

Holton, J. R., 2004: An Introduction to Dynamic Meteorology, International Geophysics Series, Vol. 88. Fourth ed., Elsevier Academic Press.

Jockel, P., R. von Kuhlmann, M. G. Lawrence, B. Steil, C. A. M. Brenninkmeijer, P. J. Crutzen, P. J. Rasch, and B. Eaton, 2001: On a fundamental problem in implementing flux-form advection schemes for tracer transport in 3-dimensional general circulation and chemistry transport models. Quarterly Journal of the Royal Meteorological Society, 127, 1035–1052.

Laprise, J. P. R. and A. Plante, 1995: A class of semi-lagrangian integrated-mass (slm) numerical transport algorithms. Monthly Weather Review, 123, 553–565.

Lauritzen, P. H., 2005: An inherently mass-conserving semi-implicit semi-lagrangian model. Ph.D. thesis, University of Copenhagen.

Lauritzen, P. H., 2007: A stability analysis of finite-volume advection schemes permitting long

time steps. Monthly Weather Review, 135 (7), 2658–2673.

Lauritzen, P. H., E. Kaas, and B. Machenhauer, 2006a: A mass-conservative semi-implicit semi-lagrangian limited-area shallow-water model on the sphere. Monthly Weather Review, 134 (4), 1205–1221.

Lauritzen, P. H., K. Lindberg, E. Kaas, and B. Machenhauer, 2006b: A locally mass-conservative version of the semi-implicit semi-lagrangian hirlam. I: Forecast model.

Lauritzen, P. H. and R. D. Nair, 2007: Monotone and conservative cascade remapping between spherical grids (cars): Regular latitude-longitude and cubed-sphere grids. Monthly Weather Review, in press.

Leslie, L. M. and R. J. Purser, 1995: Three dimensional mass-conserving semi-lagrangian schemes employing forward trajectories. Monthly Weather Review, 123, 2551–2566.

Leveque, R. J., 2002: Multidimensional Scalar Equations, chap. 20. Cambridge University Press.

Lin, S. J., 2004: A “vertically lagrangian” finite-volume dynamical core for global models. Monthly Weather Review, 132, 2293–2307.

Lin, S. J. and R. B. Rood, 1996: Multidimensional flux-form semi-lagrangian transport schemes. Monthly Weather Review, 124 (9), 2046–2070.

Machenhauer, B. and M. Olk, 1996: On the development of a semi-lagrangian cell integrated

shallow water model on the sphere. Proc. ECMWF Workshop on Semi-Lagrangian Methods, ECMWF.

Machenhauer, B., E. Kaas, and P. H. Lauritzen, 2007: Finite Volume Methods in Meteorology.

Marquina, A., 1994: Local piecewise hyperbolic reconstruction of numerical fluxes for nonlinear scalar conservation laws. SIAM Journal on Scientific Computing, 15 (4), 892–915.

McDonald, A., 1987: Accuracy of multiply-upstream semi-lagrangian advective schemes II. Monthly Weather Review, 115 (7), 1446–1450.

Nair, R., J. Cote, and A. Staniforth, 1999: Monotonic cascade interpolation for semi-lagrangian advection. Quarterly Journal of the Royal Meteorological Society, 125, 197–212.

Nair, R. D., 2004: Extension of a conservative cascade scheme on the sphere to large Courant numbers. Monthly Weather Review, 132 (1), 390–395.

Nair, R. D. and C. Jablonowski, 2007: Moving vortices on the sphere: A test case for horizontal advection problems.

Nair, R. D. and B. Machenhauer, 2002: The mass-conservative cell-integrated semi-lagrangian advection scheme on the sphere. Monthly Weather Review, 130, 649–667.

Nair, R. D., J. S. Scroggs, and F. H. M. Semazzi, 2002: Efficient conservative global transport schemes for climate and atmospheric chemistry models. Monthly Weather Review, 130 (8),

2059–2073.

Nair, R. D., J. S. Scroggs, and F. H. M. Semazzi, 2003: A forward-trajectory global semi- lagrangian transport scheme. J. Comput. Phys., 190 (1), 275–294.

Nair, R. D. and H. M. Tufo, 2007: Petascale atmospheric general circulation models. Journal of Physics: Conference Series, 78.

Priestley, A., 1993: A quasi-conservative version of the semi-lagrangian advection scheme. Monthly Weather Review, 121, 621–629.

Purser, R. J. and L. M. Leslie, 1991: An efficient interpolation procedure for high-order three-dimensional semi-lagrangian models. Monthly Weather Review, 119 (10), 2492–2498.

Qian, J.-H. and A. Kasahara, 2003: Nonhydrostatic atmospheric normal modes on beta-planes. Pure and Applied Geophysics, 160, 1315–1358.

Rancic, M., 1992: Semi-lagrangian piecewise biparabolic scheme for two-dimensional horizontal advection of a passive scalar. Monthly Weather Review, 120, 1394–1406.

Rancic, M., 1995: Efficient, conservative, monotone remapping for semi-lagrangian transport algorithms. Monthly Weather Review, 123, 1213–1217.

Ritchie, H., C. Temperton, A. Simmons, M. Hortal, T. Davies, D. Dent, and M. Hamrud, 1995: Implementation of the semi-lagrangian method in a high-resolution version of the ecmwf forecast model. Monthly Weather Review, 123 (2).

Robert, A., 1981: A stable numerical integration scheme for the primitive meteorological equations. Atmosphere Ocean, 19, 35–46.

Robert, A., 1982: A semi-lagrangian and semi-implicit numerical integration scheme for the

primitive meteorological equations. Journal of the Meteorological Society of Japan, 60, 319– 325.

Robert, A., J. Henderson, and C. Turnbull, 1972: An implicit time integration scheme for baroclinic models of the atmosphere. Monthly Weather Review, 100 (5).

Rood, R., 1987: Numerical advection algorithms and their role in atmospheric transport and

chemistry models. Reviews of Geophysics, 25, 71–100.

Royer, J. F., 1986: Correction of negative mixing ratios in spectral models by global horizontal borrowing. Monthly Weather Review, 114 (7), 1406–1410.

Semazzi, F. H. M., J. S. Scroggs, G. A. Pouliot, A. L. McKee-Burrows, M. Norman, V. Poojary, and Y.-M. Tsai, 2005: On the accuracy of semi-lagrangian numerical simulation of internal gravity wave motion in the atmosphere. Journal of the Meteorological Society of Japan, 83 (5), 851–869.

Serna, S., 2006: A class of extended limiters applied to piecewise hyperbolic methods. SIAM Journal on Scientific Computing, 28 (1), 123–140.

Smolarkiewicz, P. K. and J. A. Pudykiewicz, 1992: A class of semi-lagrangian approximations for fluids. Journal of the Atmospheric Sciences, 49 (22), 2082–2096.

Staniforth, A. and J. Cote, 1991: Semi-lagrangian integration schemes for atmospheric models: A review. Monthly Weather Review, 119 (9), 2206–2223.

Tanguay, M., A. Robert, and R. Laprise, 1990: A semi-implicit semi-lagrangian fully compressible regional forecast model. Monthly Weather Review, 118 (10).

Taylor, G. I., 1931: Effect of variation in density on the stability of superposed streams of fluid. Proc. R. Soc. London A, 499–523.

Temperton, C. and A. Staniforth, 1987: An efficient two-time-level semi-lagrangian semi- implicit integration scheme. Quarterly Journal of the Royal Meteorological Society, 113, 1025–1039.

Williamson, D. L., 1983: Description of the ncar community climate model (ccm0b). Tech. rep.

Willmott, C. J. and K. Matsuura, 2005: Advantages of the mean absolute error (mae) over the root mean square error (rmse) in assessing average model performance. Climate Research, 30,

79–82.

Willmott, C. J. and K. Matsuura, 2006: On the use of dimensioned measures of error to evaluate the performance of spatial interpolators. International Journal of Geographical Information Science, 20, 89–102.

Xiao, F., T. Yabe, X. Peng, and H. Kobayashi, 2002: Conservative and oscillation-less atmospheric transport schemes based on rational functions. Journal of Geophysical Research (Atmospheres), 107, 2–1.

Zerroukat, M., N. Wood, and A. Staniforth, 2002: Slice: A semi-lagrangian inherently conserving and efficient scheme for transport problems. Quarterly Journal of the Royal Meteorological Society, 128, 2801–2820.

Zerroukat, M., N. Wood, and A. Staniforth, 2004: Slice-s: A semi-lagrangian inherently conserving and efficient scheme for transport problems on the sphere. Quarterly Journal of the Royal Meteorological Society, 130, 2649–2664.

Zerroukat, M., N. Wood, and A. Staniforth, 2005: A monotonic and positive-definite filter for a semi-lagrangian inherently conserving and efficient (slice) scheme. Quarterly Journal of the Royal Meteorological Society, 131, 2923–2936.

Zerroukat, M., N. Wood, and A. Staniforth, 2006: The parabolic spline method (psm) for

conservative transport problems. International Journal for Numerical Methods in Fluids, 51 (11), 1297–1318.

Zerroukat, M., N. Wood, and A. Staniforth, 2007: Application of the parabolic spline method (psm) to a multi-dimensional conservative semi-lagrangian transport scheme (slice). Journal

of Computational Physics, In press.
