The formal generation of models for scientific simulations
Daniel Tang
September, 2010
Submitted in accordance with the requirements for the degree of PhD
University of Leeds, School of Earth and Environment

The candidate confirms that the work submitted is his own and that appropriate credit has been given where reference has been made to the work of others.
This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.
The right of Daniel Tang to be identified as Author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
© 2010 Daniel Tang
Acknowledgements

I would like to thank Steven Dobbie for all his comments and suggestions over the last four years, for listening calmly to my long, crypto-mathematical rantings and for bravely reading through the visions and revisions that would become this thesis. I would also like to thank Nik Stott for providing much needed motivation and for his comments on the more important parts of this thesis. Thanks also to Jonathan Chrimes for supplying figures from the DYCOMS-II intercomparison study, and to Wayne and Jane at Northern Tea Power for supplying the necessary coffee. Thanks go to Zen Internet for the loan of a computer for the stratocumulus experiment and to the Natural Environment Research Council for funding this research under award number NER/S/A/2006/14148.
Contents

Preface

1 Introduction
  1.1 Posing the question
    1.1.1 The conventional view of parameterisation
    1.1.2 Re-posing the question
  1.2 Mathematical addenda

2 Splitting models into modules
  2.1 Abstraction theory: An informal introduction
  2.2 Splitting programs into modules
  2.3 Abstraction theory: Mathematical development
    2.3.1 Abstraction of composite functions
    2.3.2 Abstraction of compound dynamic systems
    2.3.3 Abstraction of diagnostic variables
    2.3.4 Dynamic systems that have no abstraction
    2.3.5 Concrete and abstract attractors
  2.4 Related work
  2.5 Conclusion

3 Approximation of computer programs
  3.1 Static analysis of computer programs
    3.1.1 Approximating arithmetic assignments
    3.1.2 Random numbers
    3.1.3 Fixed loops
    3.1.4 if statements
    3.1.5 Conditional loops and Rice's Theorem
    3.1.6 Arrays
    3.1.7 Related work
  3.2 A formal, symbolic semantics of computer programs
    3.2.1 Formal syntactic analysis
    3.2.2 Formal semantics
    3.2.3 A symbolic semantics of computer programs
  3.3 Random numbers
  3.4 Conclusion

4 Experiments with Chebyshev Polynomials
  4.1 Introduction
  4.2 Approximate algebra with Chebyshev bounds
    4.2.1 Addition, subtraction and multiplication
    4.2.2 Division
  4.3 Transcendental functions
    4.3.1 Non-integer powers of polynomials
    4.3.2 Exponentiation of polynomials
    4.3.3 Sine and cosine
  4.4 Random numbers
  4.5 Lorenz equations
  4.6 Ideal Gas
    4.6.1 Reasoning with Heaviside functions
  4.7 Rayleigh-Benard convection
  4.8 Mie scattering

5 Experiments with DeSelby Polynomials
  5.1 Introduction
  5.2 DeSelby polynomials
    5.2.1 Computing with DeSelby polynomials
  5.3 DeSelby bounds
    5.3.1 Bounding DeSelby polynomials
    5.3.2 Addition/Subtraction of DeSelby polynomials
    5.3.3 Multiplication of DeSelby polynomials
  5.4 Testing with a Cloud Resolving Model
  5.5 Mathematical development
  5.6 Algorithms
    5.6.1 O(N) Gauss-Lobatto/DeSelby conversion
    5.6.2 O(N) differentiation
    5.6.3 DeSelby/Gauss-Lobatto to Chebyshev conversion
    5.6.4 Bounding a polynomial
    5.6.5 Bounding a polynomial (alternative method)

6 Entrainment in marine stratocumulus
  6.1 Introduction
  6.2 A Cloud Resolving Model for stratocumulus
    6.2.1 Testing the model against observation
    6.2.2 Wrapping the CRM to calculate entrainment
  6.3 The ill conditioning of entrainment
  6.4 Analysing entrainment
    6.4.1 Sensitivity of the wrapped model to domain geometry
    6.4.2 Analysis
    6.4.3 Results
  6.5 Conclusion

7 Further work and Conclusion

A Mie Theory

B Cloud Resolving Model Equations
  B.1 Symbols
    B.1.1 Prognostic variables
    B.1.2 External parameters
    B.1.3 Constants
    B.1.4 Diagnostic variables
  B.2 Equilibrium state
  B.3 Diagnostic equations
  B.4 Prognostic equations
    B.4.1 Turbulence fluxes
    B.4.2 Slow (non-acoustic) waves
    B.4.3 Radiative fluxes
    B.4.4 Surface fluxes
  B.5 Numerical implementation
  B.6 Numerical treatment of timesplitting
  B.7 Boundary conditions

C Polynomials for entrainment
  C.1 Mean entrainment
  C.2 Standard deviation
List of Figures

2.1 Graphical representation of the relation between a low resolution model, F, and a high resolution model f. Ψ and Φ are high resolution states, while X and Y are low resolution states. γ is the extra program that converts low to high resolution states.

4.1 Program to integrate the Lorenz equations
4.2 Program to simulate an atom bouncing around a 2-dimensional box
4.3 Wrapper for the atomicTheory simulation
4.4 Plot of scattering cross section for fixed radius droplets (solid line), numerically integrated over the droplet radius distribution (dashes), and iGen's simplified model (dots). For clarity, a smaller portion is reproduced in the lower plot.

5.1 The first 9 DeSelby basis functions, grouped into shells. The first shell is at the bottom, the fourth at the top. The arrow shows the points at which one basis function has a value 1 and all functions of higher degree have a value 0.

6.1 Cloudtop height of DYCOMS-II simulation: Solid line shows results from our 2D CRM. The inner error bars show the first and third quartiles of the ensemble of models in the Stevens et al. (2005) intercomparison, the outer error bars show the maximum and minimum values of the ensemble. The mid-points of the error bars are marked by crosses and plus signs respectively.
6.2 DYCOMS-II simulation: Cloudtop height from two hours into the simulation. The solid line shows the results from the 2D CRM. Inner error bars show the first and third quartiles of the ensemble of intercomparison models, the outer error bars show the maximum and minimum values of the ensemble.
6.3 DYCOMS-II simulation: Cloudbase height from two hours into the simulation. The solid line shows the results from the 2D CRM. Inner error bars show the first and third quartiles of the ensemble of intercomparison models, the outer error bars show the maximum and minimum values of the ensemble.
6.4 Total entrainment against simulated time for six simulations differing only in a 0.0025 K perturbation to the initial conditions.
6.5 Histogram showing the amount of entrainment in 12 seconds, sampled over a 15 hour simulation.
6.6 Six random walk simulations. Each step spanned 80 seconds and had a Gaussian distribution. Units were chosen to give a walk of the same magnitude as the entrainment shown in figure 6.4.
6.7 Entrainment of the CRM for different geometries and different boundary layer temperatures.
List of Tables

4.1 The three wrapped concrete functions of the Lorenz equations
4.2 The execution speed and accuracy of the parameterisations

6.1 The least squares fit of the rate of entrainment for different domain geometries
6.2 The ranges of large scale boundary layer values found in various published sources
Abstract

It is now commonplace for complex physical systems such as the climate system to be studied indirectly via computer simulations. Often, the equations that govern the underlying physical system are known, but detailed or high-resolution computer models of these equations ("governing models") are not practical because of limited computational resources; so the models are simplified or "parameterised". However, if the output of a simplified model is to lead to conclusions about a physical system, we must prove that these outputs reflect reality and are not merely artifacts of the simplifications. At present, simplifications are usually based on informal, ad-hoc methods, making it difficult or impossible to provide such a proof rigorously. Here we introduce a set of formal methods for generating computer models. We present a newly developed computer program, "iGen", which syntactically analyses the computer code of a high-resolution, governing model and, without executing it, automatically produces a much faster, simplified model with provable bounds on error compared to the governing model. These bounds allow scientists to rigorously distinguish real world phenomena from artifact in subsequent numerical experiments using the simplified model. Using simple physical systems as examples, we illustrate that iGen produces simplified models that typically execute orders of magnitude faster than their governing models. Finally, iGen is used to generate a model of entrainment in marine stratocumulus. The resulting simplified model is appropriate for use as part of a parameterisation of marine stratocumulus in a Global Climate Model.

Preface
A note to the reader
Every day, scientists create scientific knowledge by modelling physical systems on computers. Scientists are well aware of the limitations and deficiencies of their models but they rarely stop and ask themselves “is this the best way of using a computer to do science?” This report describes a vision of how we ought to be using computers to do science.
As part of our approach, great importance will be put on the formality of the proposed methods, and all theoretical claims will be supported by formal proofs. These are presented at the end of each relevant chapter rather than in the main text, thus allowing the reader to get to grips with the main ideas before getting embroiled in the details of the proofs.
After an introduction, the remaining chapters are written in a 'top-down' fashion, starting on the most abstract, theoretical level and gradually getting more concrete and practical. Chapter 2 is an important chapter but it introduces some quite abstract concepts. If the reader prefers a 'bottom-up' approach, starting with specifics and gradually generalising from them, they may wish to skim-read this chapter at first and return to it later when the application of the concepts can be more clearly seen.
Chapter 1
Introduction
Sir Francis Bacon[1] declared in his Novum Organum (1620) that the purpose of science is to improve man's lot on Earth. If climatologists are to live up to this vision with respect to the humanitarian challenges of climate change then they must communicate justified beliefs about future climate to policymakers so that they can make informed policy decisions. If we take a decision-theoretic viewpoint (see, e.g., Jeffrey, 1990), the information the policymakers need is contained in the probability distributions of certain sociologically poignant climate observables, such as average surface temperatures or rainfalls, under various emissions scenarios. Climatologists rely heavily on the use of climate models to generate beliefs about these observables. If a model simulates, say, a warmer world under a given emissions scenario then we are inclined to believe that the world would indeed be warmer if we were to make those emissions. However, a sceptic may question this deduction. Can we really be sure that the model is right? What reason does the scientist have to believe in the model projections? When the stakes are as high as they are in the case of climate change, it is quite reasonable to insist that there should be a good justification for believing in climate model results.
Suppose we had a climate model that not only produced climate projections but also produced bounds on the error and uncertainty in these projections. If these error/uncertainty bounds could be produced in such a way that we could justify them with a line of reasoning starting from a set of generally acceptable assumptions, then we could be sure that the model results were trustworthy to within these bounds. In this way we could satisfy any sceptic that our beliefs in future climate, as informed by the model's projections, are indeed justified, and provide the policymakers with the information they need. In this article we will describe how climate models could be generated in order to make this possible.

[1] Lord Keeper of the Great Seal and Baron of Verulam
The problem of simulating the climate can conveniently be split into two parts: firstly, there is the problem of understanding the physics of the climate system and describing it as a set of equations, the so-called 'equations of motion'. Secondly, there is the problem of how to use these equations to answer pertinent questions about the climate (and indeed, to complete the hermeneutic circle, to inform further experiment). One would hope that the sceptic can be convinced of the equations of motion by appeal to the scientific method. Although there is much to say about the scientific method and how it manages to justify beliefs (see, e.g., Chalmers, 1999), the second part of the problem is much less well understood, and it is this part that is the focus of our interest here.
Even if we are given the equations of motion of the climate system, it is difficult to answer practical questions about the climate because present day computers simply aren't powerful enough to directly solve these equations and generate climate projections. So, climate models instead solve a simplified set of equations. These simplifications, called 'parameterisations', are not supported by the same weight of scientific evidence as the physical equations of motion, and so the sceptic may ask what reason we have to believe that these simplifications do not affect the veracity of the model results. Let us look briefly at what arguments modellers have used to show that the simplifications in global climate models (GCMs) are justified. The fourth assessment report of the IPCC (IPCC, 2007) gives a good review of the literature on this subject; the arguments set out there are all informal, and follow one of the following formats:
1. The algorithm of the GCM is shown from first principles to be a faithful 'representation' of the physical processes described by the generally accepted physical laws.
2. The GCM is shown to agree with a set of observations, and a demonstration is given that faithfulness to the set of observations implies a faithful projection of the required climate observable.
3. The parts of the GCM are shown to agree more or less accurately with a set of observations when run in isolation, and a demonstration is given that the faithfulness of the parts to the observations implies that the GCM will give a faithful projection of the required climate observable.
4. GCMs from different research groups are compared and shown to give comparable results.
5. Multi-model ensemble predictions have been shown to perform better than a single model, supposedly confirming the hypothesis that models are perturbations of reality.
Beginning with the first argument, all non-trivial parts of a GCM are designed with some kind of reference to the underlying equations of motion and the parts are assembled with the intent of representing the whole climate system. However, as already mentioned, the use of parameterisations means that the model does not solve the physical equations. Here are some examples of the simplifications made in models:
False assumptions The physical justifications of many GCM components depend on assumptions that are known to be false, and no proof is given that such false assumptions do not adversely affect the performance of the components. Examples include the assumptions that the sea has the viscosity of honey (Chassignet and Garrao, 2001) and that cumulus clouds act like conical plumes (Gregory and Rowntree, 1990).
Unjustified generalisation Many GCM components contain 'tunable parameters' which are tuned to minimise the difference between the GCM output and some set of empirical measurements (e.g. Smith, 1990). However, no justification is given for supposing that this tuning will generalise to give good agreement with empirical measurements not used in the tuning process.
Missing processes Many physical processes which may have an effect on climate are not simulated by GCMs. For example, many GCMs do not include a carbon cycle; when one is included, many important processes in the cycle are omitted. This and many other missing processes are discussed in Lemoine (2010), for example.
Logical inconsistencies Different parts of a GCM often make conflicting assumptions, so it is not well defined what the GCM as a whole is simulating. For example, all GCMs split the atmosphere into a 'dynamical core' and a 'model physics'. The dynamical core treats, say, the temperature field as a sampled, bandwidth-limited field, while the model physics treats the same field as a Reynolds-averaged field. The two interpretations are incompatible. See theorem 1.2.1 for more detail.
No proof has been given that these simplifications do not adversely affect model output. Without this, the argument from physical representation is incomplete.
The second line of argument, that of showing that the GCM correctly predicts certain observations, is also problematic. The problem turns on the question “how many correct predictions must a model make before we should be confident that it is a good model of reality?” In theorem 1.2.2 I present an argument that shows, perhaps surprisingly, that in the absence of any other information, the amount of information a model must correctly predict about an observable, o, before it can be considered to be a good model of o is equal to the length of the shortest computer program that correctly models o (i.e. the Kolmogorov complexity of an infinite time series of the observable). It is doubtful that the amount of mutual information that has been shown to exist between GCM output and observations of climate exceeds the length of the shortest possible climate model. Take, for example, the global average surface temperature record over the last century, as presented in the IPCC fourth assessment report (IPCC, 2007). This gives, accounting for the error between modelled and observed values, only around 200 bits of mutual information. It seems doubtful that there exists a climate model that can be expressed in 200 bits of information, so this falls far short of the amount required.
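The figure of a few hundred bits can be made concrete with a toy calculation. The sketch below is illustrative only: the synthetic series, the noise levels and the histogram estimator are all assumptions introduced here, not the calculation used in the text. It estimates the per-sample mutual information between a synthetic "observed" and "modelled" century of annual values, and scales it up to a total bit count:

```python
import numpy as np

rng = np.random.default_rng(3)

def mutual_information_bits(x, y, bins=8):
    """Histogram estimate of the per-sample mutual information
    I(X;Y), in bits, between two series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal distribution of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal distribution of y
    nz = pxy > 0                          # avoid log(0) terms
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# A synthetic century of annual values: 'observed' and 'modelled'
# share an underlying signal but carry independent errors.
signal = np.cumsum(rng.normal(0.0, 0.05, size=100))
observed = signal + rng.normal(0.0, 0.1, size=100)
modelled = signal + rng.normal(0.0, 0.1, size=100)

bits_per_year = mutual_information_bits(observed, modelled)
total_bits = 100 * bits_per_year  # of order a few hundred bits at most
```

Even with a strongly correlated model, a century of annual data yields only a modest number of bits, which is the point of the argument above.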
The third approach, that of separately validating the parts, is perhaps the most realistic as it would be easier to validate a part against a large set of observations. However, surprisingly, there has been no formal proof to show how errors in the GCM parts affect errors in the GCM as a whole. Nor has there been any investigation into the effects of the contradictory definitions used in the different parts, as discussed in theorem 1.2.1. Without this, the argument is incomplete.
The fourth and fifth arguments require that the GCMs are perturbations of the real climate system and that the perturbations in the GCMs from different research groups give rise to noise that is not correlated between research groups. While GCM design does differ between research groups, a single common approximation would break this argument (since it may introduce an error across all GCMs). Indeed, there are a number of approximations that are common to all GCMs (consider only the common missing processes, as discussed above). Moreover, if the noise were uncorrelated we would expect to find that, as we increase the number, n, of members in a multi-model ensemble, the error should tend to zero as 1/√n. However, this is not found to be the case (Knutti, 2008), indicating that the errors in the models are not independent. See also Lemoine (2010) for more discussion on shared model biases.
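The 1/√n behaviour, and how a shared approximation breaks it, is easy to check numerically. The following sketch is a hypothetical illustration (the noise model and the bias value are assumptions, not taken from any GCM): independently-perturbed ensemble members average towards the truth at the 1/√n rate, while a single shared bias puts a floor under the error that no ensemble size removes.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 3.0  # a stand-in "true" quantity, arbitrary units

def ensemble_error(n_models, n_trials=2000, shared_bias=0.0):
    """RMS error of an n_models-member ensemble mean whose members are
    true_value + shared_bias + independent unit-Gaussian noise."""
    noise = rng.normal(0.0, 1.0, size=(n_trials, n_models))
    means = true_value + shared_bias + noise.mean(axis=1)
    return float(np.sqrt(np.mean((means - true_value) ** 2)))

# Independent errors: the RMS error falls roughly as 1/sqrt(n),
# so going from 1 to 16 members shrinks it by roughly sqrt(16) = 4.
ratio = ensemble_error(1) / ensemble_error(16)

# A single shared approximation survives averaging: with a bias of
# 0.5 the error floor stays near 0.5 even for a 100-member ensemble.
floor = ensemble_error(100, shared_bias=0.5)
```

The observed failure of multi-model ensembles to follow the independent-error curve is exactly the behaviour of the second case.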
On the strength of the above, the sceptic could argue that the inter-model ensemble average equilibrium climate sensitivity, as published by the IPCC (2007), is an average of simplifications that has no provable connection to the expectation value of climate sensitivity given our current state of knowledge. He could argue that the inter-model ensemble spread is a spread of disagreement between models that has no provable connection to the standard deviation of the Bayesian PDF of climate sensitivity given our current state of knowledge. He could argue, therefore, that the figures on climate sensitivity we, as a scientific community, are reporting do not give the information necessary for policymakers to make decisions that are informed by scientific knowledge, and so climate models are not fulfilling their primary reason for existence.
Clearly, in order to respond to these arguments, there is a need for much more rigorous techniques for creating models and parameterisations, so that the sceptic can be convinced by climate model results and will finally go away. Those involved in the development of computer models are aware of the problem of parameterisation, but the development of better parameterisations has been slow. Many models used today rely on parameterisations that were developed 30 to 40 years ago, not because these parameterisations are so accurate they needn't be changed, but because the development of better parameterisations has been hampered by the sheer difficulty of the problem. In recent times practitioners have come to talk of the 'parameterisation deadlock' (Randall et al., 2003): a perceived state of stagnation in the development of better parameterisations. As Kuhn (1962) pointed out, such stagnation of disciplines has occurred repeatedly throughout the history of science, and the end of each stagnation period is marked by a paradigm shift or revolution.
In the remainder of this document I present a means of escape from the parameterisation deadlock. My approach is very different from previous attempts, so I'll present a new theoretical framework with which to think about modelling. I'll describe how this new framework has led to the development of a novel piece of software that we call 'iGen', which automatically generates models and parameterisations. Finally, I'll describe how iGen has been applied to the previously unsolved problem of the parameterisation of cloud-top entrainment in marine stratocumulus.
Our first step on this journey will be to precisely define the “problem of param- eterisation”.
1.1 Posing the question
1.1.1 The conventional view of parameterisation
Modellers think of the physics of a computer model as being split into "large-scale", "resolved" processes and "small-scale", "unresolved" or "sub-grid" processes. To understand this distinction, consider the way the state of a physical system is represented in a computer. Certain physical processes, the unresolved ones, will occur on a scale much smaller than can be captured by the computer representation, while other processes, the resolved ones, will occur on a scale large enough to be well represented in the computer. Some unresolved processes can be left out of the model without affecting the veracity of the simulation. However, other unresolved processes have knock-on effects on the larger scale, and it is these effects that must be modelled in some way by the parameterisations.
Although these unresolved processes can often be modelled with some accuracy by a high resolution model, this is practical only if computer time is not an issue and the “small-scale” state of the system is known. However, a parameterisation must use relatively little computer time and only has access to the large-scale state of the system contained within the computer’s representation. This leads us to the traditional definition of the problem of parameterisation: How do we quickly and accurately calculate the large-scale effects of unresolved processes, knowing only the large-scale state of the system?
At first glance this looks like a well defined problem, but upon closer inspection it can be seen to leave the problem rather ill defined. For a start, how exactly should small and large scale processes be distinguished? For example, if a 99km cumulus cloud straddles two 100km gridboxes, is this sub-grid or large-scale? This is more than just pedantry: many (probably all) GCMs in use today contain multiple parameterisations that make this distinction in mutually contradictory ways, leading to model errors (see theorem 1.2.1). How do we define a "process"? It is clear that in any volume of atmosphere there is a great deal of interaction between different sub-grid processes. What properties must a process have in order to make it meaningfully separable from the other processes with which it interacts? Inattention to this detail has led, for example, to the problem of cloud-overlap in GCMs (Collins, 2001). What should a parameterisation do when the parameterised value is under-determined by the large-scale state? It could be, and often is, the case that the variable to be parameterised cannot be precisely determined from the information available in the large-scale variables. Recent parameterisations of convection, for example, have taken a stochastic approach (e.g. Buizza et al., 1999), but this raises the question "what is a stochastic parameterisation, and how does it differ from a non-stochastic parameterisation?" Most importantly of all, however, this definition does not give us any idea of how a model that uses a parameterisation can be used to justify the beliefs that it entails. These are just a few of the many questions that ought to be answered, or dissolved, if we are to properly define the problem.
1.1.2 Re-posing the question
Since the conventional view of parameterisation is at best incomplete, we would like to make a new definition that is complete while ensuring that uncertainty and error in the parameterised model can be quantified. It is a truism that a good parameterisation is one that, when used in a particular model, makes a fast, accurate model. It is also generally appreciated that while a given parameterisation may give accurate results in one model, it may give less accurate results in another. So the accuracy of a parameterisation cannot be quantified in isolation from its host model; it must be considered in the context of the model in which it resides if we are to quantify the error it introduces. It is towards the model, then, that we first look to see how a parameterisation should be designed.
Our first step, perhaps surprisingly, is to throw out the differential calculus. We choose not to express the equations of motion of a dynamical system as a set of differential equations, but rather as a computer program considered in a special way: using existing numerical techniques we can easily write a high resolution, 'resolve-everything' model whose output tends to a solution of any set of equations of motion in the limit of infinite processor time and memory. To formalise this limit, consider any procedural computer programming language and add a variable, δ, which has an arbitrarily small value. δ can be used in the program to calculate, for example, the grid spacing or time step. In the limit of infinite computer resources and precision, the output of the resolve-everything model tends to a solution of the differential equations as δ tends to zero.
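As a minimal illustration of this formalism (the equation and the integrator here are assumptions chosen for brevity, not taken from iGen), the following sketch writes a "resolve-everything" program for dx/dt = -x in which the single symbol delta controls every numerical scale; its output tends to the exact solution x(1) = e^{-1} as delta tends to zero:

```python
import math

def resolve_everything(delta):
    """A 'resolve-everything' program: forward-Euler integration of
    dx/dt = -x from x(0) = 1 to t = 1, with the only numerical scale
    (the time step) controlled by the single symbol delta."""
    x = 1.0
    n = round(1.0 / delta)   # number of time steps implied by delta
    for _ in range(n):
        x += delta * (-x)    # one Euler step of dx/dt = -x
    return x

exact = math.exp(-1.0)
# The error shrinks as delta does, formalising the limiting model.
errors = [abs(resolve_everything(d) - exact) for d in (0.1, 0.01, 0.001)]
```

The program itself, with delta left symbolic, is the object of study; its delta-to-zero limit stands in for the differential equations.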
So, any dynamical system can be trivially expressed as a computer program, in the limit, given its equations of motion. From now on, then, we will consider a computer program, let’s call it f, to be the dynamical system of interest, as given to us by our physical understanding. This formalism ensures that problems are well defined by imposing ‘distinguished limits’ (Klein, 2008) and will be particularly convenient for our development when we get to chapter 3.
However, f doesn't immediately tell us what we want to know. As already mentioned, we would like to know about the probabilities of making particular observations of the climate system. Let's say we write a program, o, that simulates the observation; that is, it calculates the result of making the observation for a given state of the system. So, if the start state of the system is ψ, then the value of a future observation would be o(f(ψ)). However, we don't know the start state of the system. Our knowledge of the current state of the climate system will generally consist of certain measurements from satellites, balloons etc. but no matter how many measurements we make, it is quite impossible to pin down the start state to a single state that we can feed directly into f; we simply cannot get at every detail of every turbulent eddy, for example. Instead, we must settle for a large-scale, coarse description which could be satisfied by a number of detailed, small-scale states. So, to calculate the probability of making observation o, we'd have to do a Monte-Carlo simulation on o(f(ψn)) for each of these small-scale states, ψ0...ψN. Collating the results of this Monte-Carlo simulation gives us the probability distributions we're after. This stochastic formulation of the problem of climate modelling can be traced back at least as far as Epstein (1969). Since any probability distribution can be described as a set of moments, we can consider the information we're after to be contained within a set of moments of the observable. In addition, since we can consider a square, or any other power, of an observable to also be an observable, we will, without loss of generality, consider the information we're after to be contained within the first moments of a number of observables (or of a single observable, if we allow an observable to be a vector). If we write another computer program, µ, that performs a Monte-Carlo simulation on a model for a set of input states and returns the first moment of the output, then the information we're after could be calculated by µ(o(f), B), where B is our knowledge of the start state of the system.
This is a sceptic-proof model. If we had a computer to run it on, we could say of any climate projections it made: "well, if you believe in the equations of motion and in the measurements of the boundary conditions, and you want to be consistent in your beliefs, then you must also believe in the model's projection". So the output of this model defines the value we'd like to compute. In practice, as δ gets smaller, memory requirements increase, execution time increases and the model would execute far too slowly to be of any use.
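The composition µ(o(f), B) described above can be sketched as follows. This is a toy example: the model f, the observable o and the knowledge B are hypothetical stand-ins for their climate counterparts, and the sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(psi, steps=100, dt=0.01):
    """Stand-in high-resolution model f: damped evolution of a
    detailed (here 8-component) state psi."""
    for _ in range(steps):
        psi = psi - dt * psi
    return psi

def o(psi):
    """Stand-in observation program o: a coarse observable computed
    from the detailed state (here the domain mean)."""
    return psi.mean()

def mu(model, observable, sample_B, n=5000):
    """Monte-Carlo first moment mu(o(model), B): draw detailed start
    states consistent with our coarse knowledge B, evolve each with
    the model, observe, and average."""
    return float(np.mean([observable(model(sample_B())) for _ in range(n)]))

# B: we know only the large-scale mean (2.0); the small-scale detail
# of the start state is unknown, so we sample it.
sample_B = lambda: 2.0 + rng.normal(0.0, 1.0, size=8)
estimate = mu(f, o, sample_B)  # roughly 2.0 * 0.99**100, i.e. about 0.73
```

Because o here is linear, the unknown small-scale detail averages out; in general the spread of the samples is exactly the uncertainty the sceptic is entitled to demand.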
Hasselmann (1976) was perhaps the first to suggest a way of formally simplifying this model, by separating "weather disturbances" from climate variables using the Fokker-Planck equation; this was later developed by Arnold (2001). However, this technique has not yet led to a practical way of calculating the required observables. Another strategy is to use Bayes linear analysis. In this approach µ is simplified so that the full Bayesian probability is not computed; a simpler function is computed instead, and an argument is given that this simpler function is a good model of our knowledge of the probability. This is the approach taken by Goldstein and Rougier (2009) and Oakley and O'Hagan (2002), for example. The problem with this approach is that Bayes linear inference is not as good a model of inference as full Bayesian inference. The hard-line sceptic could argue that this type of inference is not good enough.
Our approach here is to retain the full Bayesian inference as our reference program and find an alternative program that executes much faster, but can be proven to have bounded error with respect to this resolve-everything model. After being shown the proof that the error is bounded, the skeptic should believe in the output of this faster model, to within the bounds on error. So the problem of parameterisation ultimately reduces to the problem of how to find
this faster model, and how to prove that its error is bounded. This problem is set down more formally in Definition 1.2.1.
1.2 Mathematical addenda
Theorem 1.2.1. Parameterisations commonly used in models imply contradictory interpretations of the distinction between 'large' and 'small' scale, and this leads to model error.
Proof. Consider, for example, the modelling of a conserved quantity, q, advected by a velocity field v according to
$$\frac{dq}{dt} = -\nabla \cdot (vq) .$$
Turbulence parameterisations depend on the splitting of v and q into a large-scale field (denoted by an over-bar) and a perturbation (denoted by a prime), such that $v = \bar{v} + v'$ and $q = \bar{q} + q'$. Substituting this into the conservation equation gives
$$\frac{d(\bar{q} + q')}{dt} = -\nabla \cdot \left((\bar{v} + v')(\bar{q} + q')\right) . \tag{1.1}$$
Let the over-bar operation be Reynolds averaging:
$$\bar{F}(x) = \frac{1}{v}\int_{x - \frac{\Delta x}{2}}^{x + \frac{\Delta x}{2}} F(x')\,dx'$$
where x, x′ and ∆x are vectors, the integration is understood to be over the hypercube centred on x, and v is the volume of the hypercube. If we now Reynolds average equation 1.1 and remove zero terms, we are left with:
$$\frac{d\bar{q}}{dt} = -\nabla \cdot (\bar{v}\bar{q}) - \nabla \cdot \overline{v'q'} . \tag{1.2}$$
It is very common practice to interpret the first term in this equation as the large scale advection term, and the second term as the small-scale, turbulent flux of q.
Accordingly, the advection scheme calculates the first term and the turbulence parameterisation calculates the second term from the large-scale fields.
However, if we let $\bar{F}^i(x)$ be the field F after Reynolds averaging in all directions except $x_i$, then the Reynolds averaging operation can be expressed in the form
$$\bar{F}(x) = \frac{1}{\Delta x_i}\int_{x_i - \frac{\Delta x_i}{2}}^{x_i + \frac{\Delta x_i}{2}} \bar{F}^i(x_i')\,dx_i' .$$
The large-scale advection term in equation 1.2 can now be expressed as
$$\frac{d\bar{F}(x)}{dx_i} = \frac{d}{dx_i}\left(\frac{1}{\Delta x_i}\int_{x_i - \frac{\Delta x_i}{2}}^{x_i + \frac{\Delta x_i}{2}} \bar{F}^i(x_i')\,dx_i'\right)$$
$$= \frac{1}{\Delta x_i}\left(\bar{F}^i\left(x_i + \frac{\Delta x_i}{2}\right) - \bar{F}^i\left(x_i - \frac{\Delta x_i}{2}\right)\right) \tag{1.3}$$
where $F = \bar{v}\bar{q}$ and summation over i is implied. However, in a computer model, the large-scale fields, $\bar{q}$ and $\bar{v}$, are explicitly represented as gridded data or spectrally as the lowest n modes of some expansion. Numerical advection schemes use finite differences or analytic differentiation of the spectral modes to calculate the advection term in equation 1.2. However, this type of differentiation relies on the assumption that the field does not contain frequencies above the Nyquist frequency of the grid, or above the highest mode in the spectral expansion. Reynolds averaging does not conform to this assumption. In the case of differentiation by finite differences, for example, the differential can be expressed as
$$\frac{d\bar{F}(x)}{dx_i} = \frac{1}{\Delta x_i}\left(\bar{F}\left(x + \frac{\Delta x_i}{2}\right) - \bar{F}\left(x - \frac{\Delta x_i}{2}\right)\right) .$$
Subtracting equation 1.3 shows that the error in the finite-difference advection is given by
$$\epsilon = \frac{1}{\Delta x_i}\left(\bar{F}\left(x + \frac{\Delta x_i}{2}\right) - \bar{F}^i\left(x_i + \frac{\Delta x_i}{2}\right) - \bar{F}\left(x - \frac{\Delta x_i}{2}\right) + \bar{F}^i\left(x_i - \frac{\Delta x_i}{2}\right)\right) .$$
There is no reason to suppose that this error is small.
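A small numerical sketch makes the size of this error concrete. All parameter values below are invented for illustration: the field is a single Fourier mode with wavenumber k above the Nyquist frequency π/∆x of the grid, the Reynolds average is computed by the midpoint rule, and the finite-difference derivative of the averaged field is compared with the exact expression.

```python
import math

def F(x, k):
    return math.cos(k * x)

def F_bar(x, k, dx, n=2001):
    """Box (Reynolds) average of F over [x - dx/2, x + dx/2]."""
    h = dx / n
    return sum(F(x - dx / 2 + (i + 0.5) * h, k) for i in range(n)) * h / dx

k, dx, x = 10.0, 1.0, 0.3   # k = 10 is above the Nyquist frequency pi/dx

# Exact derivative of the averaged field (equation 1.3 in one dimension).
exact = (F(x + dx / 2, k) - F(x - dx / 2, k)) / dx
# Finite-difference derivative of the averaged field on a grid of spacing dx.
fd = (F_bar(x + dx / 2, k, dx) - F_bar(x - dx / 2, k, dx)) / dx

error = fd - exact   # comparable in size to the exact term itself
```

With these (arbitrary) values the error is of the same order as the exact advection term, as the proof predicts for fields with super-Nyquist content.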
In effect, the advection scheme assumes that the large and small scales are defined as the frequencies below and above the Nyquist frequency respectively, while the turbulence scheme assumes the large and small scales are defined as the Reynolds averaged field and the perturbation2. If we let L(x) denote a low-pass brick wall filter on a field x with cutoff at the Nyquist frequency, and
2This problem was also noted, in a different form, by Arakawa (2004) and Palmer (2006)
H(x) be the corresponding high-pass filter, then the error introduced by these contradictory assumptions can be expressed as
$$\epsilon = \nabla \cdot \left((L(\bar{v}) + H(\bar{v}))(L(\bar{q}) + H(\bar{q}))\right) - \nabla \cdot \left(L(\bar{v})L(\bar{q})\right)$$
$$= \nabla \cdot \left(L(\bar{v})H(\bar{q}) + H(\bar{v})L(\bar{q}) + H(\bar{v})H(\bar{q})\right) .$$
As long as the exact solution contains frequencies just above the Nyquist fre- quency, there is no guarantee that this error is small.
Appeal is sometimes made to a 'spectral gap' at the mesoscale, which would imply that ε could be made small by careful choice of grid spacing. However, the existence of any such spectral gap has been fairly robustly refuted (e.g. Nastrom, 1984; Nastrom and Gage, 1985).
As a corollary, results reported by Koshyk and Hamilton (2001) show that the simulated spectral density of kinetic energy in the Geophysical Fluid Dynamics Laboratory global model becomes unrealistically high as the Nyquist frequency is approached, showing systematic errors in the dissipation by the model physics, as would be expected in light of this proof.
It would not be easy to solve this problem by substituting a turbulence parameterisation that made the same assumptions as the advection scheme, since if we pass $\nabla \cdot ((\bar{v} + v')(\bar{q} + q'))$ through a low-pass filter, we do not get a clean separation of small and large scales, as we do with Reynolds averaging. None of the terms reduce to zero and we are left with $\epsilon = L(\nabla \cdot (v'q') + \nabla \cdot (\bar{v}q') + \nabla \cdot (v'\bar{q}))$. Worse than this, with a low-pass filter we lose locality; an unresolved feature in one gridbox can instantaneously affect the filtered field in another gridbox. (This is due to the non-local nature of filtering, rather than any physical transfer of energy.) The physical interpretation of this formal non-locality of parameterisation is explored in Palmer (2001).
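This failure of a low-pass filter to separate scales cleanly is easy to demonstrate numerically. In the sketch below (an invented example; all fields and cutoffs are arbitrary) the product of a resolved mode and an unresolved mode contains a resolved mode, so filtering a product is not the product of the filtered fields.

```python
import numpy as np

def brickwall_low(f, cutoff):
    """Low-pass brick-wall filter: zero every Fourier mode above cutoff."""
    spec = np.fft.rfft(f)
    spec[cutoff + 1:] = 0.0
    return np.fft.irfft(spec, n=len(f))

n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
v = np.cos(3 * x)   # resolved: wavenumber 3 is below the cutoff
q = np.cos(5 * x)   # unresolved: wavenumber 5 is above the cutoff
cutoff = 4

lhs = brickwall_low(v * q, cutoff)                         # keeps a resolved mode
rhs = brickwall_low(v, cutoff) * brickwall_low(q, cutoff)  # identically zero
```

Here v·q = ½(cos 2x + cos 8x), so the filtered product retains a resolved contribution even though L(q) vanishes entirely.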
Theorem 1.2.2. Given two Turing machines, one (representing reality) runs program R and the other (the model) runs program M. We observe that both machines output d as the first n bits of their output. In order for us to have any confidence in the assertion that R and M compute the same function, n must be of the same order as the Kolmogorov complexity of R.
Proof. Without loss of generality we consider computer programs executed by a prefix Turing machine (see, for example, Floyd and Biegel, 1994, for a definition of Turing machines). Since we know nothing about the states of the program tapes of R and M, by the principle of insufficient reason we assign equal probabilities to the possible states of the tapes. The probability that a tape will begin with program p, then, is simply $2^{-l(p)}$, where l(p) is the length, in bits, of program p. The probability that a machine will have an output that begins with d, then, is given by
$$m(d) = \sum_{\{p \,:\, \pi(T(p),\, l(d)) = d\}} 2^{-l(p)}$$
where T(p) is the output of a prefix Turing machine on program p and π(x, n) returns the first n bits of x.
From Bayes’s theorem, the probability that T (R) = T (M), conditioned on evidence d is given by
$$P(T(R) = T(M) \,|\, d) = \frac{m(T(R))}{m(d)} .$$
It is clear that $m(d) > 2^{-l(d) - O(1)}$ since we can always write a program that simply copies d from the program tape. The coding theorem (see Li and Vitányi, 1997, p. 253) shows that
$$m(T(R)) < 2^{-K(T(R)) - O(1)}$$
where K(x) is the Kolmogorov complexity of x. So
$$P(T(R) = T(M) \,|\, d) < \frac{2^{-K(T(R)) - O(1)}}{2^{-l(d) - O(1)}} .$$
This means that, as long as l(d) is smaller than the Kolmogorov complexity of T(R), the hypothesis that T(M) = T(R) is not well confirmed by the evidence d.
This is a short, relatively simple proof but contains many subtleties that take some time to get to grips with. Anyone with an interest in the foundations of human knowledge is encouraged to spend some time truly understanding this proof, as the investment will be repaid. The proof makes use of the intriguing concept of Kolmogorov complexity (see, for example, Li and Vitanyi, 1997) and has profound epistemological implications, effectively giving an answer to the problem of induction (see, for example, Dancy, 1985) and laying down a
foundation on which the scientific method can be built (see, for example, Mayo, 1996, for a Bayesian approach to scientific knowledge, or Chalmers, 1999, for a light, general introduction to the problem of scientific knowledge). Solomonoff (2008) has recently presented an approach to inductive reasoning similar to the one presented here.
Definition 1.2.1. The problem posed
We suppose that we are given a set of equations of motion of the climate system describing the orbit of a vector Ω:
$$\frac{\partial \Omega}{\partial t} = D(\Omega) \tag{1.4}$$
for all t, subject to
$$F(\Omega) = \emptyset \tag{1.5}$$
where D is a function that includes differential operators in space, F represents satisfaction of any conservation laws or generalised observations, and ∅ is the zero vector. These equations may contain stochastic terms which represent any incompleteness of our physical understanding.
Let ξ(Ω, δ, t, s) be a computer program where Ω is a state of the climate system, s is a seed for a pseudo-random number generator and the result of executing ξ,
$T_\xi$, is the integration of Ω over t simulated seconds, such that if we let $\Omega(t) = \lim_{\delta \to 0} T_\xi(\Omega_0, \delta, t, s)$ then Ω(t) satisfies equations 1.4 and 1.5 for all t > 0. (Given equations 1.4 and 1.5 it is possible to write a program ξ using standard numerical techniques and a 'resolve everything' strategy.)
Let o(Ω) be the result of an observation of the climate system in state Ω and let B be a multi-set of climate system states which represents our knowledge of the boundary conditions. Our aim is to calculate the nth moment
$$\mu_n(B, t) = \frac{1}{|B||S|} \sum_{\Omega_0 \in B,\; s \in S} o\!\left(\lim_{\delta \to 0} T_\xi(\Omega_0, \delta, t, s)\right)^n$$
where S is the set of possible random seeds.
Without loss of generality we can consider only the first moment, n = 1, since higher moments can be calculated from the first moment of a modified observable $o'(x) = o(x)^n$. Clearly, it is possible to write a program to calculate $\mu_1$ by executing ξ once for each member of B and summing the results. Call this
program φ. If we define a metric on functions $|T_\phi - T_\psi|$ and an upper limit ε, then we can define the set of programs
$$\Psi = \{\psi \,:\, |T_\phi - T_\psi| < \epsilon\}$$
and the problem is to find a member of Ψ that executes much faster than φ.
Chapter 2
Splitting models into modules
In the last chapter we defined our ultimate aim to be that of finding a good, low-resolution approximation of a high-resolution model, and we showed how to construct the high-resolution model. We now look at how the construction of the low-resolution model can be made easier by splitting it into modules or sub-tasks in various ways, thus reducing a very hard problem to a number of easier problems. For example, we will probably construct the model so that a simulation consists of multiple, sequential executions of a single timestep function. Also, within each timestep we may, for example, have one module simulating the sea and another simulating the atmosphere. The atmospheric model may in turn be split into modules dealing separately with convection, chemistry, advection etc.
If we are to use these modularisation strategies, however, we must ensure that the errors and uncertainties of each module are correctly accounted for in the final output. Our strategy for doing this is as follows: the user decides on a way of splitting the model into modules by defining the physical meaning of each module. From this we derive a set of ideal modules that minimise the error in the final result over the expected lifetime of the model. Note that in general there will not be a way of making the error zero, since the modularisation itself can be a source of error. We term this the "intrinsic" error. In addition to this error, each module is required to report a bound on its error compared to the ideal module. This error will come from whatever approximations have been used to increase the speed of execution of the module, so we call this the "approximation error".
To calculate the moments of the outputs of the whole model, the output of each module should be fed into a 'supervisor' program that combines the intrinsic and approximation errors into a specification that describes the probability distribution of the output. The supervisor program then chooses a state at random from this distribution and uses it as the output of the module. The whole model should be executed multiple times in a Monte-Carlo simulation, the results of which will give us probabilistic information about our knowledge of the moments of the observables.
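The supervisor scheme might be sketched as follows. The module, the uniform error model and every numerical value are invented for illustration; nothing here fixes the actual distribution the supervisor should use.

```python
import random

def supervised_step(state, module, intrinsic_err, approx_err, rng):
    """Run one module, then let the supervisor replace its output with a
    random draw from within the combined error bound (a uniform
    perturbation is an illustrative choice, not a prescription)."""
    out = module(state)
    bound = intrinsic_err + approx_err
    return out + rng.uniform(-bound, bound)

def monte_carlo(module, state0, n_steps, n_runs, seed=0):
    """Execute the whole model many times; return the ensemble of outputs,
    from which moments of observables can be estimated."""
    results = []
    for run in range(n_runs):
        rng = random.Random(seed + run)
        s = state0
        for _ in range(n_steps):
            s = supervised_step(s, module, 0.01, 0.02, rng)
        results.append(s)
    return results

outs = monte_carlo(lambda s: 0.5 * s, state0=1.0, n_steps=10, n_runs=200)
ensemble_mean = sum(outs) / len(outs)
```

The ensemble mean and higher moments of `outs` then stand in for the moments of the observables discussed above.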
We now deconstruct a typical real-world model to see which modularisation strategies are commonly used, and show how to derive the ideal module and what intrinsic error each modularisation strategy introduces. Surprisingly, there is no formal theory of how this can be done. However, the mathematical concepts described in Lasota and Mackey (1985) and those forming the subject of random dynamical systems (e.g. Bhattacharya and Majumdar (2007) and Arnold (2002)) give a good start towards this. I have modified these concepts to apply to our current problem and taken some ideas from Cousot and Cousot (1977) to create a new theory, which I call abstraction theory, that can be used to reason about high and low resolution models. This theory is formally presented in section 2.3, but in the following section we explain the main results of the theory in less formal language.
2.1 Abstraction theory: An informal introduction
The most obvious difference between our resolve-everything model and a real-world model is that the real-world model will have a much lower resolution. This loss of resolution may be a source of error, so we must clarify how error is generated by loss of resolution. Before we can begin to compare the output of a low-res model to that of a high-res model, however, we must have some way to relate high-res states to low-res states. As we have seen, the input and output of a high-res model, f, consists of a description of every minute detail of the state of the system. A low resolution model, F, on the other hand, would only require the large scale details of the system, neglecting the small scale detail. We compare these different resolutions by considering an extra computer program, γ, that takes a low-res state as input and turns it into a high-res state (definition 2.3.3). This requires a filling in of the small scale details using a random number generator. This program represents the conditional probability of finding the system in some high-res state, given that we know that it is in some low-res state.

[Figure 2.1: Graphical representation of the relation between a low-resolution model, F, and a high-resolution model, f. Ψ and Φ are high-resolution states, while X and Y are low-resolution states; γ is the extra program that converts low-resolution states to high-resolution states.]
If we now attach one copy of γ to the input of the high-res model and another copy to the output of the low-res model (giving f (γ(X)) and γ (F (X)) which, for simplicity, we write fγ and γF ) we have two stochastic programs, both of which take a low-res input and return a high-res output (see figure 2.1). It is evident that if γF has the same output probability distribution as fγ for all low-res inputs (i.e. γF = fγ), then the two models will always compute the same moments of any observable, as long as our knowledge of the boundary conditions can be expressed in terms of the low-res states. What’s more, if γF is equal to fγ for a single timestep then they will remain equal for any number of timesteps (theorem 2.3.1). So, if we can prove that γF = fγ over one timestep of some low-res model F then we have proved that F is error free, despite its low resolution.
Unfortunately, if we are to take the prognostic variables of a GCM as our low-res state, then no low-res model exists that has this property (theorem 2.3.6). However, for any size of low-res state, there is a set of prognostic variables for which there does exist a low-res model that has this property (theorem 2.3.7). This can and should be interpreted as a call for a new set of prognostic variables for GCMs; this is left as an exercise for the reader.
In the meantime, however, we must find a way of dealing with GCM prognostic variables. We proceed by perturbing the high-resolution system so that there does exist an exact low-resolution model for the variables we're interested in. The perturbation is chosen very carefully so as to ensure that the error in any observable of the resulting low-res model (compared to the un-perturbed high-res model) is zero when averaged over the lifetime of the model (see section 2.3.4). This has the satisfying consequence that as we increase the expected lifetime of the model, the attractor of the low-res model tends towards the attractor of the un-perturbed high-res model (theorem 2.3.10).
This low-res model can be written in the following way. Begin by writing a program that converts high-res states into low-res ones (i.e. the opposite of γ); call it α. This is generally quite easy as it simply involves throwing away some of the information in the high-res state. For example, we may average over some volume or spectrally filter some field. The other converter program, γ, can be constructed from α. We show in section 2.3.5 how this should be done, but essentially, for any low-res input, γ should output a high-res output with a probability equal to the conditional probability of finding the system in this state, given that we know it's in the low-res state. Given these two programs, α and γ, we can construct a single timestep of our low-res model by taking the high-res model and "wrapping" it in these two extra programs, so that γ is bolted on to its input and α to its output (i.e. αfγ). So, when designing a low-res model we would like to make its timestep function a good approximation of αfγ.
It may at first seem a bit pointless to construct a low-res model that requires the execution of a high-res model but, superparameterisation aside, we aren’t proposing that this program should be used in a GCM. Instead, it serves as a constructive definition of what a low-res model should be doing; a reference against which we can measure our approximations and parameterisations.
2.2 Splitting programs into modules
We’ve now reduced the problem to the consideration of a single timestep of a low-res model, but the timestep of a model will itself consist of many sub- processes and modules. A parameterisation scheme, for example, will generally run alongside other parameterisation schemes, it may only update some of the output variables, or may not have any direct contact with the output, instead sending some kind of flux to another module, for example. In addition, a pa- rameterisation scheme may be executed many times in one timestep, perhaps once for each gridbox. All this modularisation may also be a source of error which must be quantified. Once this is done, we can go on to calculate what function each module should be trying to calculate in order to minimise error in the timestep function, and so ultimately in the whole model.
We begin by considering the splitting of the timestep into two sequential operations on a state. For example, perhaps the convection scheme first calculates a convective mass flux which is added to the state, then this updated state is used to calculate, say, detrainment of convective mass flux into layer cloud. If the split has some well defined physical meaning, then these processes can be calculated by the high-res model and put together in the same way, i.e. f can be split into two operations, g and h, such that f = g(h). However, by insisting on a split we are, in effect, insisting that the model has a low-resolution representation at some point in the middle of the timestep, meaning that $f' = g(\alpha\gamma h)$. We have shown (theorem 2.3.3) that in this case, if we let $G = \alpha_g g \gamma_g$ and $H = \alpha_h h \gamma_h$, the model F = GH minimises error.
We next consider the splitting of the model into gridboxes, where the information available to each gridbox contains only a limited amount of local information and the timestep is calculated by separately updating each gridbox. To do this we imagine the high-res model to be split into separate volumes, each representing one gridbox. In this case, even if we minimise the error in each gridbox, the act of splitting into gridboxes may introduce further error. It turns out that extra error will result unless, in the high-res model, there is no correlation between the states of the different gridbox volumes after one timestep, given the start state of the whole system (theorem 2.3.4). However, the low-res gridbox should still calculate the function αgγ, where g is the high resolution model over the volume of the gridbox, in order to minimise error. The same argument can be used to split a single gridbox into different physical schemes such as advection, precipitation, convection etc.
Finally, some parameterisation schemes make use of diagnostic variables. In this case, the high resolution model can be split into two functions, f = g(ψ, h(ψ)), and we can minimise error, rather predictably, by letting H = αhγ, G = αgγ and F = G(X, H(X)).
2.3 Abstraction theory: Mathematical development
We now formally develop abstraction theory. Abstraction theory helps us reason about the links between low resolution models and their high resolution counterparts. Let p be a high resolution model and U be a two-tape universal Turing machine, and let $\phi = \langle s_{t+\Delta t}, \emptyset \rangle = U_p(\langle s_t, b_t \rangle)$ be the function calculated by p, where $s_t$ is the atmospheric state at time t, $b_t$ are the boundary conditions at time t and ∅ is the null boundary condition. Similarly, let P be a low resolution model and $\Phi = \langle S_{t+\Delta t}, \emptyset \rangle = U_P(\langle S_t, B_t \rangle)$ be the function computed by P. We suppose, without loss of generality, that the tape of the Turing machine is arbitrarily long but finite.
Definition 2.3.1. In the context of abstraction theory we will refer to the high resolution model as the concrete model, and the low resolution model as the abstract model.
Definition 2.3.2. A stochastic state vector of a set, X, is a vector whose dimension is equal to the cardinality of X and whose elements are real numbers in the interval [0, 1]. The set of all stochastic state vectors of a set X will be written $\vec\wp(X)$.
Stochastic state vectors over the states of the concrete and abstract models will be referred to as concrete and abstract stochastic state vectors respectively.
If the sum of elements of a stochastic state vector of a set X is 1, it can be interpreted as a probability distribution over the members of X. Representing this as a vector, rather than a function, will turn out to be very convenient, as we shall see. In the remainder of this section we will use index notation $V^x$ to denote the xth element of the vector V, and the Einstein summation notation to represent linear operators on vectors. For example, if O is a matrix whose (x, y)th element is $O^y_x$ then
$$O^y_x V^x \equiv \sum_x O^y_x V^x .$$
Summation is only implied over pairs of upper and lower indices, so $V^x W^x$ is not summed over. To aid notation, concrete stochastic state vectors will be named using upper-case Greek letters, and their indices will be given lower-case Greek letters. Abstract stochastic state vectors will be named using upper-case Latin letters, and their indices will be given lower-case Latin letters.
The key idea of abstraction is that the abstract model is intended to be a representation of the concrete model and each (non-stochastic) state of the abstract model must have a fixed meaning, in terms of the states of the concrete model. This meaning is given mathematical rigour by defining it as a probability distribution over the states of the concrete model. Once the meanings of the abstract states are defined, the relation between the behaviour of the abstract model and the concrete model follows. In full generality, this meaning should be given by a function from abstract stochastic state vectors to concrete stochastic state vectors.
Definition 2.3.3. A semantics, $\gamma^\psi_x$, is a matrix that transforms abstract stochastic state vectors to concrete stochastic state vectors. The semantics must be probability preserving, i.e. for all x
$$\sum_\psi \gamma^\psi_x = 1$$
and each element must be in the range [0, 1].
Definition 2.3.4. Any function, $f : \Psi \to \Phi$, defines a power matrix, $\wp(f) : \vec\wp(\Psi) \to \vec\wp(\Phi)$, where $\vec\wp(\Psi)$ and $\vec\wp(\Phi)$ are the sets of stochastic state vectors on Ψ and Φ respectively, such that
$$\wp(f)^\phi_\psi = \begin{cases} 1 & \text{if } \phi = f(\psi) \\ 0 & \text{otherwise.} \end{cases}$$
This definition is extended to functions that return random variables as
$$\wp(f)^\phi_\psi = P(f(\psi) = \phi) .$$
So, for example, the power matrix of our concrete model, ℘(φ), would be the timestep function on concrete stochastic state vectors. In this way, the power matrix of a timestep function is the discrete equivalent of the Perron–Frobenius operator (see, e.g. Lasota and Mackey, 1985). We will refer to the power matrix of the concrete model as the stochastic concrete model.
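A power matrix can be built directly from Definition 2.3.4. The four-state system below is a toy invented for illustration; it shows ℘(f) acting on a stochastic state vector to propagate a probability distribution one timestep forward.

```python
import numpy as np

def power_matrix(f, n_states):
    """The 0/1 matrix with entry [phi, psi] equal to 1 iff phi = f(psi).
    Columns index the input state psi; rows index the output state phi."""
    P = np.zeros((n_states, n_states))
    for psi in range(n_states):
        P[f(psi), psi] = 1.0
    return P

# Toy concrete model: each of 4 states steps to the next, wrapping round.
P = power_matrix(lambda s: (s + 1) % 4, 4)

# Acting on a stochastic state vector moves the distribution forward.
V = np.array([0.5, 0.5, 0.0, 0.0])   # equal belief in states 0 and 1
V1 = P @ V                            # equal belief in states 1 and 2
```

Each column of the matrix sums to one, so probability is preserved, mirroring the probability-preservation conditions of the definitions above.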
Definition 2.3.5. Given a stochastic concrete model f and a semantics γ, a function F is said to be an abstraction of f with respect to γ if and only if
$$\gamma^\phi_y F^y_x = f^\phi_\psi \gamma^\psi_x .$$
This definition really just embodies what we mean by meaning. Suppose we have a high resolution computer model $f'$ and a low resolution model $F'$. If the meaning of an abstract (non-stochastic) state x is the probability distribution of concrete states given by $\gamma_x$, and $F'$ is to have the same meaning as $f'$, we expect the meaning of $F'(x)$ (i.e. $\gamma_{F'(x)}$) to be the probability distribution, $\gamma_x$, after it has been passed through the concrete power matrix $\wp(f')$.
Abstractions have the important property that after any number of timesteps, the meaning of their state is exactly the probability distribution that we would get if we were to run an ensemble of concrete simulations. More formally:
Theorem 2.3.1. Given a stochastic concrete model, f, its abstraction, F, and a semantics, γ, then $\gamma F^n = f^n \gamma$ for all n, where by $F^n$ we mean n iterations of F (i.e. matrix multiplication, n times).
Proof. By definition
$$\gamma F = f\gamma .$$
Suppose now that for some n
$$\gamma F^n = f^n \gamma . \tag{2.1}$$
In this case
$$\gamma F^{n+1} = f^n \gamma F$$
but from the definition of abstraction
$$f^n \gamma F = f^{n+1} \gamma$$
so
$$\gamma F^{n+1} = f^{n+1} \gamma .$$
From the definition of abstraction, equation 2.1 is true for n = 1, so by recurrence, for any n,
$$\gamma F^n = f^n \gamma . \qquad\square$$
Definition 2.3.6. An abstraction operator, $\alpha^x_\psi$, is a transform from concrete stochastic state vectors to abstract stochastic state vectors. It is the inverse of γ, such that $\alpha^y_\psi \gamma^\psi_x = I^y_x$, where $I^y_x$ is the identity matrix, all elements $\alpha^y_\psi$ are in the range [0, 1] and α is probability preserving, i.e. for all ψ
$$\sum_y \alpha^y_\psi = 1 .$$
If the semantics γ is a partition (i.e. for any given ψ, $\gamma^\psi_x$ is non-zero for only one value of x) then there is a unique probability-preserving inverse of γ. In this case we call the abstraction operator a partitioning abstraction operator and γ a partitioning semantics.
Theorem 2.3.2. There is a unique abstraction operator of a partitioning semantics γ, defined as
$$\alpha^x_\psi = \begin{cases} 1 & \text{if } \gamma^\psi_x > 0 \\ 0 & \text{otherwise.} \end{cases}$$
Proof. By definition
$$\alpha^y_\psi \gamma^\psi_x = I^y_x$$
so for any x and φ such that $\gamma^\phi_x \neq 0$, $\alpha^y_\phi = 0$ if y ≠ x, since neither α nor γ has any negative elements.
From probability preservation, for any given φ, $\sum_y \alpha^y_\phi = 1$, so it follows that if $\gamma^\phi_x \neq 0$ then $\alpha^x_\phi = 1$, so
$$\alpha^x_\psi = \begin{cases} 1 & \text{if } \gamma^\psi_x \neq 0 \\ 0 & \text{otherwise.} \end{cases} \qquad\square$$
Given α, since γF = fγ, then
$$\alpha\gamma F = \alpha f\gamma$$
so, since αγ = I,
$$F = \alpha f \gamma$$
and we have a constructive definition of F.
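This construction can be checked numerically on a toy system. Below, an invented four-state concrete model and a two-state partitioning semantics are written as matrices, α is obtained from Theorem 2.3.2, and the resulting F = αfγ satisfies γFⁿ = fⁿγ (Theorem 2.3.1) for every n; the example is illustrative, not drawn from the thesis.

```python
import numpy as np

# Concrete model: the permutation 0->2, 1->3, 2->0, 3->1, as a power matrix.
f = np.zeros((4, 4))
for psi, phi in enumerate([2, 3, 0, 1]):
    f[phi, psi] = 1.0

# Partitioning semantics: abstract state 0 means 'uniform over {0, 1}',
# abstract state 1 means 'uniform over {2, 3}'.
gamma = np.array([[0.5, 0.0],
                  [0.5, 0.0],
                  [0.0, 0.5],
                  [0.0, 0.5]])

# The unique abstraction operator of Theorem 2.3.2:
# alpha[x, psi] = 1 wherever gamma[psi, x] > 0.
alpha = (gamma.T > 0).astype(float)

F = alpha @ f @ gamma   # the constructive definition F = alpha f gamma
```

For this system the abstraction is exact because the concrete dynamics map each partition block onto another block uniformly; in general (as the following sections show) that is not guaranteed.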
In full generality, the semantics may not be partitioning. However, we will consider only partitioning semantics here. This does not affect the theory's relevance to existing GCMs, which all have prognostic variables with partitioning semantics (i.e. given a microstate of a gridbox, its large scale state is defined). However, there is a very strong case for abandoning partitioning prognostic variables in favour of, for example, ranges of the thermodynamic quantities. This would lead to models that provide provably correct results and obviate the need for ensemble runs. This would certainly lead to more persuasive model results and may also lead to more computationally efficient models. An investigation into this is left for future work.
2.3.1 Abstraction of composite functions
A composite function is a function f(ψ) that can be expressed as a composition of functions g(h(ψ)). In terms of operators on stochastic state vectors, composition becomes matrix multiplication: $f^\phi_\psi = g^\phi_\xi h^\xi_\psi$. Composition is important because it is the basis of time-stepping.
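That composition turns into matrix multiplication can be verified directly on small examples; the five-state functions here are arbitrary choices made for the check.

```python
import numpy as np

def power_matrix(f, n):
    """0/1 matrix with entry [f(s), s] = 1 for each state s (Definition 2.3.4)."""
    P = np.zeros((n, n))
    for s in range(n):
        P[f(s), s] = 1.0
    return P

g = lambda s: (s + 1) % 5
h = lambda s: (2 * s) % 5

# The power matrix of the composite g(h(.)) equals the product of the
# power matrices of g and h.
lhs = power_matrix(lambda s: g(h(s)), 5)
rhs = power_matrix(g, 5) @ power_matrix(h, 5)
```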
Theorem 2.3.3. If we have a composite function $f^\phi_\psi = g^\phi_\xi h^\xi_\psi$, and G = αgγ is an abstraction of g and H = αhγ is an abstraction of h, then GH is an abstraction of f for all inputs as long as γα = I.
Proof. By substitution
$$GH = \alpha g\gamma\alpha h\gamma$$
but since γα = I,
$$GH = \alpha gh\gamma$$
but f = gh, so
$$GH = \alpha f\gamma . \qquad\square$$
The requirement γα = I is not generally satisfiable. However, if we are averaging over a large number of time steps, then we have the weaker requirement that Ψ = αγΨ for the sum of states.
2.3.2 Abstraction of compound dynamic systems
In a GCM, each variable in the state belongs to a gridbox and each gridbox is considered as a separate subsystem which communicates with the rest of the system via its boundaries. The usefulness of this is to ‘divide and conquer’ the problem; it is hoped that it is easier to solve the equations of motion of the subsystems and join their respective results, than to tackle the whole problem in one go. In this section we consider this in its most general form as a splitting of a system into two parts (and by induction into any number of parts) and consider what must be assumed for this to be formally possible.
For any dynamic system, $x_{t+\Delta t} = F(x_t)$, we can arbitrarily split this system into two subsystems
$$a_{t+\Delta t} = g(a_t, b_t)$$
$$b_{t+\Delta t} = h(b_t, a_t)$$
such that $F(T(a, b)) = T(g(a, b), h(a, b))$ for some one-one mapping $x_t = T(a_t, b_t)$. a can be considered to be the state of subsystem g and b to be its boundary conditions; conversely for h.
However, this raises the question of how to construct abstractions of these subsystems so as to ensure that the whole abstract system is an abstraction of the whole concrete system. Put more formally, we have a concrete system $f^{a'b'}_{ab} = g^{a'}_{ab} h^{b'}_{ab}$ and would like to construct a stochastic abstract system $G^{A'}_{AB} H^{B'}_{AB}$ which is an abstraction of f. From this, it can be seen that this is only possible if, for the abstraction of f, A′ and B′ are uncorrelated, given AB. This is not guaranteed, and is a property of the prognostic variables that we must ensure is fulfilled. Some thought shows that this property is not fulfilled by the prognostic variables of the current generation of GCMs. For example, gravity waves which are not captured by the prognostic variables can propagate between gridboxes and trigger convection, or the spatial distribution of convection within a gridbox can affect convection in a neighbouring gridbox by self-organisation without being reflected in the large scale variables.
Theorem 2.3.4. Given a compound concrete model $f^{a'b'}_{ab} = g^{a'}_{ab} h^{b'}_{ab}$, a semantics $\gamma^{ab}_{AB}$ and abstraction operators $\alpha^{A'}_{a'}$ and $\beta^{B'}_{b'}$, then an abstract system $G^{A'}_{AB} H^{B'}_{AB}$ is an abstraction of f if G is an abstraction of g, H is an abstraction of h and, for the abstraction of f, A′ and B′ are uncorrelated, given AB.
Proof. Let
$$\kappa^{A'}_{xy} = \alpha^{A'}_{a'}\,\wp(g)^{a'}_{xy}$$
and
$$\lambda^{B'}_{xy} = \beta^{B'}_{b'}\,\wp(h)^{b'}_{xy} .$$
Using the notation $(c)_x = c$ for all x, let
$$(\Delta\kappa^{A'}_{xy})_{XY} = (\kappa^{A'}_{xy})_{XY} - (\kappa^{A'}_{ij}\gamma^{ij}_{XY})_{xy} \tag{2.2}$$
and
$$(\Delta\lambda^{B'}_{xy})_{XY} = (\lambda^{B'}_{xy})_{XY} - (\lambda^{B'}_{ij}\gamma^{ij}_{XY})_{xy} .$$
Multiplying through by γ gives
$$(\Delta\kappa^{A'}_{xy})_{XY}\gamma^{xy}_{XY} = (\kappa^{A'}_{xy})_{XY}\gamma^{xy}_{XY} - (\kappa^{A'}_{ij}\gamma^{ij}_{XY})_{xy}\gamma^{xy}_{XY}$$
but because γ gives conditional probabilities, $(k_{XY})_{xy}\gamma^{xy}_{XY} = k_{XY}$, so from probability conservation
$$(\Delta\kappa^{A'}_{xy})_{XY}\gamma^{xy}_{XY} = \kappa^{A'}_{xy}\gamma^{xy}_{XY} - \kappa^{A'}_{ij}\gamma^{ij}_{XY} = 0$$
and similarly for ∆λ.

Let F be the stochastic abstraction of the whole concrete system:
$$F^{A'B'}_{AB} = \alpha^{A'}_{a'}\beta^{B'}_{b'}\,\wp(g)^{a'}_{ab}\,\wp(h)^{b'}_{ab}\,\gamma^{ab}_{AB} .$$
Substituting equation 2.2 into F gives
$$F^{A'B'}_{AB} = \left((\kappa^{A'}_{ij}\gamma^{ij}_{AB})_{ab} + (\Delta\kappa^{A'}_{ab})_{AB}\right)\left((\lambda^{B'}_{ij}\gamma^{ij}_{AB})_{ab} + (\Delta\lambda^{B'}_{ab})_{AB}\right)\gamma^{ab}_{AB} .$$
Multiplying out and removing zero terms leaves
$$F^{A'B'}_{AB} = \kappa^{A'}_{ij}\gamma^{ij}_{AB}\,\lambda^{B'}_{kl}\gamma^{kl}_{AB} + \Delta\kappa^{A'}_{xy}\Delta\lambda^{B'}_{xy}\gamma^{xy}_{AB}$$
but by definition
$$G^{A'}_{AB}H^{B'}_{AB} = \kappa^{A'}_{ij}\gamma^{ij}_{AB}\,\lambda^{B'}_{kl}\gamma^{kl}_{AB} .$$
The last two equations are equal when the last term in the first equation is zero, but equating this term to zero is equivalent to stating that A′ and B′ are uncorrelated, given AB. $\square$
2.3.3 Abstraction of Diagnostic variables
Another technique used in GCMs is to calculate first a diagnostic variable, then calculate the new prognostic variables as a function of the diagnostic variables. This also helps to divide and conquer the problem. This amounts to splitting the system into the equations
$$x_{t+\Delta t} = g\big(x_t, h(x_t)\big).$$
We now show how this system can be abstracted.
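As a minimal illustration of this split, before abstracting it, the diagnostic-then-prognostic pattern can be sketched as follows (the functions g and h below are invented placeholders, not taken from any model):

```python
# Hypothetical sketch of a timestep of the form x_{t+dt} = g(x_t, h(x_t)).
# Both functions are invented for illustration only.

def h(x):
    # diagnostic variable derived from the current state
    return x * x

def g(x, d):
    # prognostic update that consumes the diagnostic variable
    return x + 0.1 * d

def timestep(x):
    d = h(x)           # first compute the diagnostic variable...
    return g(x, d)     # ...then advance the prognostic variable

print(timestep(2.0))   # 2.0 + 0.1 * 4.0 = 2.4
```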
Theorem 2.3.5. Given a concrete model $f^{x'}_x = \wp(h)^{x'}_{x\,g(x)}$, a semantics $\gamma^x_X$, and abstraction operators $\alpha^{X'}_{x'}$ and $\beta^D_d$, then the compound system $H^{X'}_{XD}G^D_X$ is a stochastic abstraction of $f$ if $G$ is a stochastic abstraction of $g$ and $H^{X'}_{XD}$ is a stochastic abstraction of $h$.
Proof. By definition
$$H^{X'}_{XD}G^D_X = \alpha^{X'}_{x'}\,\wp(h)^{x'}_{xd}\,\gamma'^{xd}_{XD}\,\beta^D_d\,\wp(g)^d_y\,\gamma^y_X$$
but for a given $x$, $\gamma'^{xd}_{XD}$ must have a deterministic $d$ equal to $g(x)$ (i.e. $\gamma'^{xd}_{XD} = \delta^d_{g(x)}\delta^{g(x)}_e\gamma'^{xe}_{XD}$) so
$$H^{X'}_{XD}G^D_X = \alpha^{X'}_{x'}\,\wp(h)^{x'}_{x\,g(x)}\,\gamma'^{x}_{XD}\,\beta^D_d\,\wp(g)^d_y\,\gamma^y_X.$$
The abstraction of the whole system is
$$F^{X'}_X = \alpha^{X'}_{x'}\,\wp(h)^{x'}_{x\,g(x)}\,\gamma^x_X$$
so the two are equal as long as
$$\gamma'^{x}_{XD}\,\beta^D_d\,\wp(g)^d_y\,\gamma^y_X = \gamma^x_X.$$
This still allows us some freedom in the definition of $\gamma'^{x}_{XD}$, but as long as the above is satisfied, the compound system is guaranteed to be an abstraction of the whole concrete system.
2.3.4 Dynamic systems that have no abstraction
Given a concrete function and a semantics, there is no guarantee that an abstraction exists. That is, although it is true that if $F$ is an abstraction then it must be equal to $\alpha f\gamma$, the converse does not hold: if $F = \alpha f\gamma$, it is not necessarily true that $F$ is an abstraction, i.e. that $\gamma F = f\gamma$, since this would imply that $\gamma\alpha f\gamma = f\gamma$, which is not necessarily true. This is the case for the prognostic variables of a GCM.
Theorem 2.3.6. There does not exist an abstraction of physical reality for a GCM with today’s prognostic variables.
Proof. Since we have no mathematical definition of γ for GCM states, we must appeal to our intuitive understanding of the properties of γ in this case.
Suppose that there does exist an abstraction $F$ of $f$ such that $\gamma F = f\gamma$. Consider an abstract state $X$ such that $X_x = 1$ for some state $x$, now run it backwards for some time until $\alpha f^{-n}\gamma X$ is a mixture of abstract states (this is clearly possible from our intuitive understanding of the meaning of GCM states). Now consider two of the pure states in that mixture with non-zero probability, call them $A$ and $B$ (where $A_a = 1$ and $B_b = 1$ for some states $a$ and $b$) and run them forward through $F$. Clearly, $F^nA$ has a non-zero probability of being in state $X$ so
$$(F^nA)_x \neq 0.$$
Again from our intuitive understanding of the meaning of GCM states, it wouldn't be difficult to find a concrete state, $\psi$, in $X$ such that $(f^n\gamma B)^\psi \neq 0$ and $(f^n\gamma A)^\psi = 0$. Since $(f^n\gamma B)^\psi \neq 0$, if $F$ is exact then $(\gamma F^nB)^\psi \neq 0$, and since $\psi$ is in $X$ then $\gamma^\psi_x \neq 0$. But since $(f^n\gamma A)^\psi = 0$ then $(\gamma F^nA)^\psi = 0$, and since $\psi$ is in $X$ and $\gamma^\psi_x \neq 0$ then
$$(F^nA)_x = 0$$
in contradiction to our previous deduction. So $F$ cannot be an abstraction for any semantics that conforms to our intuitive understanding of GCM prognostic variables.
As an aside, however...
Theorem 2.3.7. For every dynamic system on its attractor and an abstract state of any size there is a semantics for which there does exist an abstraction.
Proof. Consider only that the attractor of a computer program is a loop of states (in the case of non-deterministic computer programs, consider the seed of a pseudo random number generator as being part of the concrete state). Let $N$ be the number of states on this loop and $\psi_n$ be the $n^{th}$ state from some arbitrary starting point. If $L$ is the number of abstract states, choose the lowest integer, $I$, such that $N/I$ is an integer not higher than $L$. Now let the meaning of the $i^{th}$ abstract state, $X_i$, be the set of concrete states $\{\psi_n : n = jI + i,\ 0 \le j < N/I\}$ for $0 \le i < I$, with equal probability; any remaining abstract states mean the null vector or, to maintain probability preservation, some dummy concrete state.
With this semantics, the timestep function is particularly simple and is given by $F(X) = X + 1$. So timestepping is easy; the problem is that it is not clear how to go from abstract states to observables. It would seem that when designing an abstract model, the choice of prognostic variables should be made carefully, to trade off ease of integration against ease of calculating observables while ensuring abstractability. It would be interesting to explore the properties of the set of semantics that admit of abstractions for a given concrete function, but for now we'll call this the end of the aside.
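The construction in this proof can be checked numerically; the sketch below uses invented sizes ($N = 12$ loop states, $I = 3$ abstract states in use) and represents $f$, $\gamma$ and $\alpha$ as matrices:

```python
import numpy as np

# N = 12 concrete states on the attractor loop, I = 3 abstract states in use,
# so each abstract state means N/I = 4 concrete states with probability 1/4.
# These sizes are invented for illustration.
N, I = 12, 3

# f: the deterministic timestep psi_n -> psi_{n+1 (mod N)}, written as a
# column-stochastic matrix acting on stochastic state vectors.
f = np.zeros((N, N))
for n in range(N):
    f[(n + 1) % N, n] = 1.0

# gamma: the semantics; column i is uniform over {psi_n : n = j*I + i}.
gamma = np.zeros((N, I))
for i in range(I):
    for j in range(N // I):
        gamma[j * I + i, i] = I / N

# alpha: the abstraction operator; psi_n belongs to abstract state n mod I.
alpha = np.zeros((I, N))
for n in range(N):
    alpha[n % I, n] = 1.0

# The abstract timestep F = alpha f gamma is exactly X -> X + 1 (mod I).
F = alpha @ f @ gamma
expected = np.zeros((I, I))
for i in range(I):
    expected[(i + 1) % I, i] = 1.0
print(np.allclose(F, expected))  # True
```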
In order to interpret the meaning of models with the usual GCM prognostic variables we consider an abstract timestep $F$ to be an abstraction of a concrete model timestep $f$ with a random 'turbulent noise' perturbation $p$. $p$ is chosen so that $F$ is an abstraction of $pf$, i.e. $\gamma F = pf\gamma$. If we now calculate some observable $o(pf)^n\gamma X = oF^nX$ then this is our best guess at the value of $o$, given the noise. The uncertainty introduced by the noise will be reflected in an increase of the variance of the observable. That is, if $o_\sigma$ is the operator giving the variance of the observable then $o_\sigma(pf)^n\gamma X$ should be larger than $o_\sigma f^n\gamma X$.

If the problem we're investigating with our model is well conditioned, then the observable we're interested in should have a fairly well defined value given our knowledge of the boundary conditions, so $o_\sigma f^n\gamma X$ should be small. We would also expect that the value of the observable is insensitive to small, 'turbulent' perturbations. If we find that $o_\sigma(pf)^n\gamma X$ is also small, then this confirms the hypothesis that the observable is insensitive to these small perturbations.
We choose $p$ so that the average perturbation in any observable over the operational lifetime of the model is zero. So, if $\Psi_{0\ldots N}$ are the concrete states that a model will pass through in its operational lifetime, then for some observable $o$
$$\bar\pi = \sum_{n=0}^{N}\big(o\Psi_n - op\Psi_n\big) = 0$$
so
$$0 = o(1-p)\sum_{n=0}^{N}\Psi_n.$$
If we let $\bar\Psi = \sum_{n=0}^{N}\Psi_n$ then the above is true for all observables, $o$, if
$$p\bar\Psi = \bar\Psi \tag{2.3}$$
i.e. if the average of all states is an eigenvector of $p$. This can be satisfied if we let
$$p = \gamma\alpha \tag{2.4}$$
where $\gamma$ and $\alpha$ are the semantics and the abstraction operator respectively, and for a given $\alpha$ (which is given by the thermodynamic interpretation of the prognostic variables of the GCM), $\gamma$ can be derived since, by substituting equation 2.4 into equation 2.3,
$$\gamma^\psi_x\,\alpha^x_\phi\bar\Psi^\phi = \bar\Psi^\psi$$
but, from the definition of $\alpha$, for any given $\psi$, $\gamma^\psi_x$ is non-zero only if $\alpha^x_\psi = 1$, so the sum over $x$ consists of only one term, the one with $\alpha^x_\psi = 1$. So
$$\gamma^\psi_x = (\alpha^T)^\psi_x\,\frac{\bar\Psi^\psi}{\alpha^x_\phi\bar\Psi^\phi}$$
i.e. $\gamma^\psi_x = P(\psi|x)$, the conditional probability of the system being in state $\psi$ given that it is in state $x$ (and that the model is in some state in its operational lifetime). In this case, $F = \alpha pf\gamma = \alpha\gamma\alpha f\gamma = \alpha f\gamma$ and $F$ is unchanged by the perturbation; the perturbation is entirely sub-grid and undetectable on the large scale.
This perturbation technique extends to apply to the case of composite and compound systems in the obvious way. This definition of the semantics leads to some interesting properties of the abstract function in the case when the expected lifetime of the model tends to infinity. To see this, we need to think about the concrete and abstract models as timesteps of dynamic systems and about the attractors of these systems.
2.3.5 Concrete and abstract attractors
The Sun-Earth system is travelling along some portion of its attractor (Palmer, 1999), and all climate simulations will represent a trajectory along some portion of this attractor, so it is the properties of this attractor that we are interested in. For this reason we take some time now to explore the relationship between the attractors of concrete and abstract models.
In a dynamic system, the attractor is defined as the points in the phase space of the system that an orbit passes arbitrarily close to an infinite number of times as the length of the orbit tends to infinity. In the case of our computer programs there are always a finite number of states, so we can define the corresponding concept as the states that an orbit visits an infinite number of times as its length tends to infinity. Clearly, this must consist of a loop of states. The attractor of a computer program can be represented as a single stochastic state vector by defining it as the probability of finding an orbit in a given state as its length tends to infinity.
It may be argued that by restricting our attention to one-dimensional attractors we are excluding chaotic systems, which have attractors with fractal dimension. However, the fractal dimension of an attractor, while of some mathematical interest, has no physical significance, in the sense that for any finite set of actual or potential observations of a chaotic system there exists an indistinguishable dynamic system with a one dimensional attractor. As a corollary to this, consider our definition of a dynamic system as a computer program in the limit that a certain variable, $\delta$, tends to zero: for any finite set of observations there exists a finite value of $\delta$ for which the program output is identical to that in the limit.
Definition 2.3.7. The attractor, $a(x)$, of a stochastic model, $f$, for a given start state, $x$, is defined as the limit of the average stochastic state of the system over a number of iterations, as the number of iterations tends to infinity. So,
$$a(x) = \lim_{n\to\infty}\frac{\sum_{m=1}^{n}f^mx}{n}.$$
Theorem 2.3.8. An attractor of a stochastic model, $f$, is a fixpoint of $f$.

Proof. By definition an attractor, $a$, is given by
$$a(x) = \lim_{n\to\infty}\frac{\sum_{m=1}^{n}f^mx}{n}$$
so
$$fa(x) = \lim_{n\to\infty}\frac{\sum_{m=2}^{n+1}f^mx}{n} = a(x) + \lim_{n\to\infty}\frac{f^{n+1}x}{n} - \lim_{n\to\infty}\frac{fx}{n}$$
but both limits in the above equation are zero, since all elements of $fx$ and $f^{n+1}x$ are always finite, so
$$fa(x) = a(x).$$
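This fixpoint property is easy to check numerically; the sketch below uses an invented 5-state stochastic model, with $f$ a random column-stochastic matrix:

```python
import numpy as np

# Numerical sketch of Theorem 2.3.8 on an invented model: the attractor
# a(x) = lim (1/n) sum_m f^m x is (approximately) a fixpoint of f.

rng = np.random.default_rng(0)
f = rng.random((5, 5))
f /= f.sum(axis=0)             # column-stochastic: columns sum to 1

x = np.zeros(5)
x[0] = 1.0                     # start in a pure state

# Average the stochastic state over many iterations.
a = np.zeros(5)
state = x.copy()
n = 20000
for _ in range(n):
    state = f @ state
    a += state
a /= n

print(np.allclose(f @ a, a, atol=1e-3))  # True: f a(x) = a(x), up to the
                                         # finite-n truncation of the limit
```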
Definition 2.3.8. For a given abstraction operator, $\alpha$, and a concrete state, $\Psi$, let the induced semantics be a semantics that satisfies
$$\gamma^\psi_x = (\alpha^T)^\psi_x\,\frac{\Psi^\psi}{\alpha^x_\phi\Psi^\phi}$$
for all $x$ such that $\alpha^x_\phi\Psi^\phi > 0$, where $\alpha^T$ is the transpose of $\alpha$.
The induced semantics has a very intuitive interpretation: If we suppose that Ψ is the prior probability that the concrete system is in some state, then the induced semantics is P (ψ|x), the Bayesian probability that the concrete system is in state ψ, given that it is in the abstract state x.
Theorem 2.3.9. For a given abstraction operator, α, and a concrete state, Ψ, if γ is the semantics induced by Ψ and α then
γαΨ = Ψ .
Proof. By substitution
$$(\gamma\alpha\Psi)^\psi = (\alpha^T)^\psi_x\,\frac{\Psi^\psi}{\alpha^x_\phi\Psi^\phi}\,\alpha^x_i\Psi^i = \sum_x(\alpha^T)^\psi_x\,\Psi^\psi$$
but, from the definition of the abstraction operator, the sum over $x$ results in the unit vector, so
$$\gamma\alpha\Psi = \Psi.$$
Theorem 2.3.10. For a given concrete function $f$ with a fixpoint $\Psi$ and an abstraction operator $\alpha$, if $F = \alpha f\gamma$, where $\gamma$ is the semantics induced by $\alpha$ and $\Psi$, then $\alpha\Psi$ is a fixpoint of $F$.
Proof. Given that $F = \alpha f\gamma$,
$$F\alpha\Psi = \alpha f\gamma\alpha\Psi$$
but, from theorem 2.3.9, $\gamma\alpha\Psi = \Psi$, so
$$F\alpha\Psi = \alpha f\Psi$$
but, by definition, $\Psi$ is a fixpoint of $f$ so
$$F\alpha\Psi = \alpha\Psi.$$
So, an abstract function defined by αfγ will have the same attractor as f.
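Theorems 2.3.9 and 2.3.10 can be checked numerically on an invented concrete model; in the sketch below the sizes, the random timestep $f$ and the partition defining $\alpha$ are all arbitrary choices:

```python
import numpy as np

# Sketch: gamma induced by alpha and a fixpoint Psi of f satisfies
# gamma alpha Psi = Psi (theorem 2.3.9), and alpha Psi is a fixpoint of
# F = alpha f gamma (theorem 2.3.10). All ingredients are invented.

rng = np.random.default_rng(1)
Nc, Na = 6, 2                      # concrete / abstract state counts

f = rng.random((Nc, Nc))
f /= f.sum(axis=0)                 # column-stochastic concrete timestep

# Psi: a fixpoint (stationary distribution) of f, by power iteration.
Psi = np.ones(Nc) / Nc
for _ in range(10000):
    Psi = f @ Psi

# alpha: each concrete state belongs to exactly one abstract state.
alpha = np.zeros((Na, Nc))
alpha[0, :3] = 1.0
alpha[1, 3:] = 1.0

# gamma: the induced semantics, gamma[psi, x] = P(psi | x) under prior Psi.
gamma = (alpha.T * Psi[:, None]) / (alpha @ Psi)[None, :]

print(np.allclose(gamma @ (alpha @ Psi), Psi))       # True (theorem 2.3.9)
F = alpha @ f @ gamma
print(np.allclose(F @ (alpha @ Psi), alpha @ Psi))   # True (theorem 2.3.10)
```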
Theorem 2.3.11. For a set of temporally contiguous states $\Phi_1\ldots\Phi_N$ and a concrete function $f$, an abstract function $F = \alpha f\gamma$ will have zero error in the first moment of any observable when averaged over the states $\Phi_1\ldots\Phi_N$ as $N$ goes to infinity, if $\gamma$ is the semantics induced by $\alpha$ and $\Psi = \frac{\sum_n\Phi_n}{N}$.
Proof. For a given state, $\Phi$, the error in an observable, $o$, between $f$ and $F$ after $t$ timesteps is given by
$$\epsilon = o\gamma F^t\alpha\Phi - of^t\Phi.$$
When averaged over a number of states $\Phi_1\ldots\Phi_N$, the average error is given by
$$\bar\epsilon = \frac{1}{N}\sum_{n=1}^{N}\big(o\gamma F^t\alpha\Phi_n - of^t\Phi_n\big) = o\gamma F^t\alpha\Psi - of^t\Psi$$
where
$$\Psi = \frac{\sum_n\Phi_n}{N}$$
but as $N$ tends to infinity, $\Psi$ tends to the attractor of $f$, so $\Psi$ becomes a fixpoint of $f$. Also, from theorem 2.3.10, $\alpha\Psi$ becomes a fixpoint of $F$, so
$$\bar\epsilon = o\gamma\alpha\Psi - o\Psi$$
but from theorem 2.3.9 $\gamma\alpha\Psi = \Psi$ so
$$\bar\epsilon = o\Psi - o\Psi = 0.$$
2.4 Related work
The theory of abstraction presented here is new. However, the transformation of probability distributions over the phase space of a dynamical system by a timestep function has been considered before. It is useful to distinguish two types of timestep functions on probability distributions: deterministic and random. Deterministic transforms have the property that a delta function is always transformed to another delta function, and so there is an underlying deterministic timestep function in the phase space of the system. In this case the probability transform is known as the Frobenius-Perron operator. Lasota and Mackey (1985) present many very pleasing mathematical results for such systems; however, most of these are of very little practical use for the present project. Random transforms, on the other hand, do not transform delta functions to delta functions, and so their underlying timestep operator in the phase space of the system must have some random element (as is generally the case for the semantics presented here). This case forms the subject of random dynamical systems, expositions of which can be found in Bhattacharya and Majumdar (2007) and Arnold (2002).
A stochastic approach to the dynamics of the atmosphere can be traced back at least as far as Epstein (1969), who realised that this is the proper way to treat uncertainty in the start state of the atmosphere due to paucity of observation. However, he did not pursue this very far, deeming it too difficult and computationally expensive. The first stochastic treatment of the global climate is given in Hasselmann (1976), who supposed that the prognostic variables of a global model can be partitioned into two sets, the 'climate' variables and the 'weather' variables, such that the time scale of the weather variables is much smaller than that of the climate variables; this was later developed by Arnold (2001). The stochastic approach to parameterisation has attracted much interest recently, especially in the parameterisation of convection. Buizza et al. (1999) present some quite provocative results; see Shutts and Palmer (2007) and references therein for more.
2.5 Conclusion
In this chapter we have shown how to correctly modularise a low-res model. We have developed a theory of abstraction and used it to show that the timestep of a low-res model should compute the function F = αfγ, where f is the timestep of a high-res model. This ensures that the meaning of the attractor of F is the attractor of f. We have shown how to calculate the error introduced by modularising a model and how to minimise the error in each module. Finally we have seen how the semantics of each module relates to the semantics of the whole model.
Given a concrete model, f, it is easy enough to write a computer program that computes its abstraction F = αfγ: simply take f and write a wrapper that transforms its inputs, stochastically, via a simulation of γ and transforms its outputs via a simulation of α. A Monte Carlo simulation of F would then reproduce the PDF of outputs for any given input. However, if we did this we would end up with a low-resolution model that runs slower than the high-resolution model. So, now that we have identified ideal computer programs for each part of a GCM, our next step should be to develop a technique of identifying computer programs that approximate these ideal programs and execute as fast as possible. This will be the subject of the next chapter.
Chapter 3
Approximation of computer programs
We have now reduced the problem of simulating climate to that of making a number of modules that each approximate a high resolution code of the form αfγ, where f is a detailed, high resolution model and α and γ form a 'wrapper' that ensures that the input and output of the module are of the appropriate, low resolution form. In this section, we present a method of transforming αfγ into a much faster program, call it F, that approximates αfγ with some provable bound on the error. So, $F = \alpha f\gamma + \epsilon$ where $N(\epsilon) < B$ for some bound $B$ and some measure of error $N$.
3.1 Static analysis of computer programs
The basic idea behind our approach is to treat a computer program as a math- ematical object to be analysed rather than a code to be executed. When a program is executed, its input variables are assigned particular values and the program describes a procedure for manipulating these values to produce a set of output values. In contrast, when a program is analysed, it is the structure of the program (i.e. its syntax) that is of interest while the values of the input variables remain unspecified. So, the computer program becomes the object of analysis and the analysis is an attempt to derive properties of the program that
hold independently of the input values.
To approximate a program, $P$, we begin by considering it as a function $Y = U_P(X)$ from a vector of input variables, $X$, to a vector of output variables, $Y$, so that each element of $X$ or $Y$ represents one input or output variable respectively. We then show that for each element $Y_n$ of $Y$ (i.e. for each output variable of the program $P$) there exist polynomials $Q(X)$ and $R(X)$ such that
$$Y_n = \frac{Q(X)}{R(X)}.$$
We call a function of this form a 'polynomial rational'. From this, we define a vector of polynomial rationals, $A$, so that $A(X) = U_P(X)$ for all input vectors $X$. That is, for any assignment of values to the input variables, the polynomial rationals evaluate to the same values as the program's outputs, and in this way the vector of polynomials, $A$, is equivalent to the program, $P$1. We then find an approximation to $A$ in such a way that the error between the approximation and the original does not exceed some user-specified value over some user-specified range of input values. This approximation is then converted back into a computer program which approximates the original program.
Since the approximation is derived from the original through a sequence of well defined mathematical transformations, formal bounds on the actual error introduced by the approximation can be calculated and reported. This describes, in very rough outline, how programs can be approximated. However, there remain many details to be filled in and complications to be addressed, and we will introduce and deal with these in the following sections. We begin with very simple examples and work gradually toward more complex, realistic cases.
3.1.1 Approximating arithmetic assignments
Let’s begin by deriving an approximation of the following very simple program:
1Strictly speaking, $U_P(X)$ is a vector of floating point numbers, fixed-length integers, characters and booleans, while the evaluation of $A$ is a vector of real numbers. We consider a real number, $r$, to be equal to a floating point number, $f$, if there does not exist another floating point number $f'$ such that $|r - f'| < |r - f|$. Similarly for integers. Characters and booleans are treated as integers. We leave the properties of the special floating point numbers inf, -inf and NaN undefined in the symbolic interpretation and allow $A$ to differ from $P$ if any floating point operation in $P$ returns any of these values.
input(x)
a = x + 1
y = a*a;
y = y*y;
output(y)
Normally, we would simply execute this program for some input, say x = 0.1. So, after the first line a = 1.1, after the second line y = 1.1 × 1.1 = 1.21, the next line y = 1.21 × 1.21 = 1.4641. So the output is 1.4641.
However, when analysing the program we don't want to think of the input and output as having specific values, like 1.4641, but rather we want to express the output as a polynomial rational with the input $x$ as an independent variable. To do this, we consider the variables of the program to be polynomials, rather than floating point numbers with specific, numeric values. Since polynomials can be added together and multiplied, our program still has a clear interpretation as a sequence of operations on polynomials. So, we begin by setting the input variable $x$ to be equal to the first order polynomial $f(x) = x$. The first line sets $a$ to the polynomial $a(x) = 1 + x$, the second line sets $y(x) = (1+x)(1+x) = 1 + 2x + x^2$, and the next line $y(x) = (1+2x+x^2)(1+2x+x^2) = 1 + 4x + 6x^2 + 4x^3 + x^4$, so the output of the program can be considered to be the polynomial $y(x) = 1 + 4x + 6x^2 + 4x^3 + x^4$. If we evaluate this polynomial at $x = 0.1$, for example, we get $y(0.1) = 1 + 0.4 + 0.06 + 0.004 + 0.0001 = 1.4641$, as we would expect.
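This symbolic execution is easy to reproduce; the sketch below uses numpy's polynomial arithmetic as a stand-in for the analysis machinery described in the text:

```python
from numpy.polynomial import Polynomial

# Re-running the example program with the variables interpreted as
# polynomials in the input x rather than as floating point values.

x = Polynomial([0, 1])   # the polynomial f(x) = x
a = x + 1                # a(x) = 1 + x
y = a * a                # y(x) = 1 + 2x + x^2
y = y * y                # y(x) = 1 + 4x + 6x^2 + 4x^3 + x^4

print(y.coef)            # [1. 4. 6. 4. 1.]
print(y(0.1))            # 1.4641, matching the concrete execution
```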
Suppose we are now told that $x$, the program's input, is always in the range $-0.1 \le x \le 0.1$ and that we can apply simplifications as long as the absolute error in the output remains bounded by 0.1. The simplest way of approximating the program in this case is by getting rid of the higher order terms in $y(x)$, so the approximation becomes $y(x) \approx 1 + 4x$ with an error given by $\epsilon(x) = 6x^2 + 4x^3 + x^4$. Within the range $-0.1 \le x \le 0.1$, $\epsilon(x)$ has a maximum absolute value of 0.0641, when $x = 0.1$, so $y(x) = 1 + 4x \pm 0.0641$. This converts into the simplified program:

input(x)
y = 1 + 4*x
output(y)
A better approximation can be made by using the first order Chebyshev approximation of $y(x)$, giving $y(x) = 1.03004 + 4.03x \pm 0.0310625$.
This method can be applied to any sequence of the four arithmetic operations $+$, $-$, $\times$ and $\div$, and the result can always be expressed in the form $\frac{Q}{R}$ by using the following identities:
$$\frac{Q}{R}\times\frac{S}{T} = \frac{QS}{RT}$$
$$\frac{Q}{R}+\frac{S}{T} = \frac{QT+RS}{RT}$$
$$-\frac{Q}{R} = \frac{-Q}{R}$$
and
$$\frac{Q/R}{S/T} = \frac{QT}{SR}.$$
It should be noted that the resulting polynomials interpret the algebraic operators in a program as exact operations, rather than the approximate arithmetic of floating point numbers. So to this extent the evaluation of a polynomial may differ from the result of executing a program. This can be thought of as replacing all finite precision floating point numbers with infinite precision. However, a properly written simulation should be insensitive to increases in the precision of floating point arithmetic. In this case any differences would be dwarfed by the approximations we are likely to want to introduce, so we do not consider this to be an issue for our purposes.
3.1.2 Random numbers
In our definition of a wrapped model, αfγ, we specified that γ was, in general, a stochastic transformation. That is, for any given input, its output may take on any of a number of values with some probability distribution. When written as a program, this would be implemented by using a pseudo random number generator. Let's say the function rand() calls a random number generator whose output is a floating point number in the range [−1 : 1] with a top-hat probability distribution. Random numbers with any arbitrary distribution can be generated from the rand() function by passing it through a function f(rand()) which is related to the inverse cumulative probability density of the required distribution. More details are given in theorem 3.3.1 in section 3.3.
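A sketch of this inverse-CDF construction follows; the exponential target distribution is our invented example, and theorem 3.3.1 itself is not reproduced here:

```python
import math
import random

# Sketch: generating a non-uniform random number from a rand() that is
# uniform on [-1, 1], via the inverse cumulative distribution function.
# The Exp(1) target is an arbitrary example chosen for illustration.

def rand():
    return random.uniform(-1.0, 1.0)   # stand-in for the generator in the text

def exponential():
    u = (rand() + 1.0) / 2.0           # map [-1, 1] onto [0, 1)
    return -math.log(1.0 - u)          # inverse CDF of Exp(1)

random.seed(0)
samples = [exponential() for _ in range(100000)]
print(sum(samples) / len(samples))     # close to 1, the mean of Exp(1)
```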
Let’s suppose that the input of the program in the previous section represents some measured quantity and that the measuring device can only measure x to
within ±0.005. This can be represented by letting γ be the program x = x + 0.005*rand(). So the wrapped program is

input(x)
x = x + 0.005*rand();
a = x + 1
y = a*a;
y = y*y;
output(y)
Random numbers can be dealt with by representing the output of each call to rand() as a unique, especially tagged, variable, let’s say r0...rn. The moments of each output can then be calculated by integrating over each of the tagged, random variables. So, analysing the program above shows the first moment of y to be
$$\bar y = \frac{1}{2}\int_{-1}^{1}\Big[(1 + 4x + 6x^2 + 4x^3 + x^4) + (0.02 + 0.06x + 0.06x^2 + 0.02x^3)r_0 + (0.00015 + 0.0003x + 0.00015x^2)r_0^2 + (5\times10^{-7} + 5\times10^{-7}x)r_0^3 + 6.25\times10^{-10}r_0^4\Big]\,dr_0$$
and evaluating the integral gives
$$\bar y = 1.000050000125 + 4.0001x + 6.00005x^2 + 4x^3 + x^4.$$
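This first moment can be checked by exact polynomial integration at a sample input, say $x = 0.1$ (a verification sketch, not part of the method itself):

```python
from numpy.polynomial import Polynomial

# The wrapped program computes y = (1 + x + 0.005*r0)^4 with r0 uniform on
# [-1, 1], so the mean of y is half the integral of that polynomial in r0.

x = 0.1
p = Polynomial([1 + x, 0.005]) ** 4        # y as a polynomial in r0
P = p.integ()                              # antiderivative with respect to r0
mean = (P(1.0) - P(-1.0)) / 2.0

# The first moment formula derived above, evaluated at x = 0.1:
formula = 1.000050000125 + 4.0001*x + 6.00005*x**2 + 4*x**3 + x**4
print(abs(mean - formula) < 1e-12)          # True
```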
3.1.3 Fixed loops
We now look at a more complex program that performs a numerical integration of the Lorenz equations (Lorenz, 1963):
$$\frac{dX}{dt} = \sigma(Y - X)$$
$$\frac{dY}{dt} = rX - Y - XZ$$
$$\frac{dZ}{dt} = XY - bZ.$$
Following Lorenz, we let $\sigma = 10$ and $b = \frac{8}{3}$. Here, we imagine that $r$ has been measured to have value $r_m$ with an instrument that has an associated error of $e$.
We suppose that the measured value lies in the range $0 < r_m < 28 - e$ (Lorenz uses the value $r = 28$). The following program integrates these equations:

input(rm)
r = rm + e*rand()
X = 0.0
Y = 1.0
Z = 0.0
loop 6 times {
    dx_dt = 10.0*(Y-X)
    dy_dt = r*X - Y - X*Z
    dz_dt = X*Y - (8.0/3.0)*Z
    X = X + dx_dt*0.01
    Y = Y + dy_dt*0.01
    Z = Z + dz_dt*0.01
}
output(X)
The new element here is the loop. To deal with loops that have a fixed number of iterations, first reduce the body of the loop to a vector of polynomial rationals in the same manner as we have already done. So, if we place the variables into a vector $(X, Y, Z)$, then the body of the loop would be equal to the vector
$$L = \begin{pmatrix} x + 0.1y - 0.1x \\ y + 0.01rx - 0.01y - 0.01xz \\ z + 0.01xy - \frac{0.08}{3}z \end{pmatrix}.$$
The loop, then, is equal to the polynomial $L^6(X,Y,Z)$ and the output, $X$, is just the first element of this. On performing the calculation, the value of $X$ comes out as
$$X = -1.09964\times10^{-15}(r_m + ei)^3 + 5.66995\times10^{-7}(r_m + ei)^2 + 0.00169011(r_m + ei) + 0.455595$$
where $i$ is the value returned by the random number generator.
Using Chebyshev approximation (under the assumption that errors up to $10^{-4}$ are acceptable), this can be approximated as
$$X = 0.00171(r_m + ei) + 0.45532 \pm 5.6\times10^{-5}.$$
This can now be easily integrated over $i$ to find the average value of $X$:
$$\bar X = \int_{-\infty}^{\infty}\big(0.00171(r_m + ei) + 0.45532 \pm 5.6\times10^{-5}\big)P(i)\,di$$
where $P(i)$ is the probability distribution of the random number generator. So, since by definition $\bar i = 0$,
$$\bar X = 0.00171r_m + 0.45532 \pm 5.6\times10^{-5}.$$
This equation converts to a computer program

input(rm)
X = 0.00171*rm + 0.45532;
output(X);

that returns the expectation value of $X$ for any measured value in the range $0 < r_m < 28 - e$ in 2 arithmetic operations, with an error bounded by $5.6\times10^{-5}$. This compares to 90 operations for the original program.
An important point to note here is that in our examples so far the calculation of the polynomial has proceeded sequentially, in much the same order as it would during an execution. The loop, $L^6$, however, illustrates that an analysis may proceed very differently from an execution. During an execution of the loop, the program pointer would loop round 6 times; during an analysis, however, we immediately define the meaning of the loop as $L^6$. This can be evaluated in any way we please. For example, we may evaluate $M = L \otimes L \otimes L$, then $L^6 = M \otimes M$, giving $L^6$ in 3 (albeit polynomial) operations. In some cases, there exists a closed form solution for a loop $L^n$ in terms of $n$. As a simple example, suppose we have a loop with 100 iterations, and the body of the loop evaluates to $L = \langle 2X, Y + 1\rangle$ for an input vector $\langle X, Y\rangle$. $L^n$ can be immediately solved as $L^n = \langle 2^nX, Y + n\rangle$, giving $L^{100} = \langle 2^{100}X, Y + 100\rangle$ without the need to go through the 100 iterations.
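The closed form claim is easy to check; a sketch using exact integer arithmetic:

```python
# The loop body maps <X, Y> to <2X, Y + 1>, so L^n = <2^n X, Y + n>
# can be evaluated without iterating.

def run_loop(X, Y, n):
    for _ in range(n):             # execution: n passes of the loop body
        X, Y = 2 * X, Y + 1
    return X, Y

def closed_form(X, Y, n):          # analysis: evaluate L^n directly
    return (2 ** n) * X, Y + n

print(run_loop(3, 5, 100) == closed_form(3, 5, 100))   # True
```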
So, when a program is executed, a program pointer moves, step by step, through the program. When a program is analysed, however, its equivalent polynomial is built up from the structures of the program. There is no program pointer; structures can be transformed in any order, and the end of the program may be transformed before the beginning.
Finding a closed form solution for a loop $L^n$ is the same as solving a recurrence relation. Much work has been done on this; for a survey of early work see Lueker (1980) and references therein. More recently, Bachmann et al. (1994) introduced 'chains of recurrences' as a formal tool for solving recurrences; this was later developed by van Engelen (2000, 2001, 2004) to analyse loops in programs for compiler optimisation. Pop et al. (2005, 2006) developed this idea further into 'trees of recurrences' for the same application.
3.1.4 if statements
Consider the following program which roughly simulates a ball bouncing on the floor in a gravitational field:

input(z) {
    g = 10.0;
    dt = 0.01;
    v = 0.0;
    loop 100 {
        z = z + v*dt - 0.5*g*dt*dt;
        v = v - g*dt;
        if(z <= 0) {
            v = -0.8*v;
            z = 0.0;
        }
    }
    output(z)
}

z is the height of the ball and v is its velocity in the upward direction. The input is the initial height that the ball is dropped from and is taken to be in the range [1 : 2].
The new structure here is the if statement. This is dealt with by using the Heaviside step function, defined as
$$H(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$
where $H(0)$ is undefined.
The body of the if statement can be calculated in the usual manner, giving the polynomial vector P = (0.0, −0.8v). The condition of the if statement can be calculated by turning the inequality into the homogeneous form B > 0, in this case we get −z > 0. Applying the Heaviside function to the left hand side of this inequality gives H(−z) which is equal to 1 if the inequality is true, 0 otherwise. Multiplying the condition by −1 gives H(z) which is the reverse: 0 if true, 1 otherwise. So the whole if statement is equivalent to
F = H(−z)P + H(z)I where I = (z, v) is the identity polynomial vector (i.e. the nth element is equal to the nth variable). So, if z < 0 then H(−z) = 1 and H(z) = 0 so F = P . If z > 0 then H(−z) = 0 and H(z) = 1 so F = I. This is exactly the behaviour we require for equivalence to the if statement.
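A quick numerical check of this encoding on sample states (a sketch; the sample points are chosen so that $H(0)$ never arises):

```python
# F = H(-z)P + H(z)I reproduces the if statement for z < 0 and z > 0.

def H(x):
    return 1.0 if x > 0 else 0.0       # H(0) is undefined; 0 is used here
                                       # but the sample states avoid it

def branch(z, v):                      # the original if statement
    if z <= 0:
        return 0.0, -0.8 * v
    return z, v

def heaviside_form(z, v):
    P = (0.0, -0.8 * v)                # body of the if statement
    I = (z, v)                         # identity polynomial vector
    return tuple(H(-z) * p + H(z) * i for p, i in zip(P, I))

for z, v in [(-0.3, 1.0), (0.2, -0.5)]:
    print(branch(z, v) == heaviside_form(z, v))   # True, True
```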
So, the whole loop equates to the vector
$$L = \begin{pmatrix} H(z + 0.01v - 0.0005)\,(z + 0.01v - 0.0005) \\ \big(1 - 1.8H(-z - 0.01v + 0.0005)\big)\,(v - 0.1) \end{pmatrix}$$
and the whole program equates to $L^{100}$.
Strictly speaking, the floating point variable z may equal 0.0 when the if statement is reached, in which case $F$ should equal $P$ according to the if statement. However, in this case the analysis, $L^{100}$, is a function of $H(0)$, which is undefined. The correct behaviour can be restored by noting that the floating point number $z$ that forms the input of the program does not denote a single real number but rather denotes the range of numbers $[z - \frac{\epsilon}{2}, z + \frac{\epsilon}{2}]$, where $\epsilon$ is the smallest increment to $z$ representable as a floating point number. So, we can associate some uncertainty to the input value by giving all inputs an implicit uncertainty of $\pm\frac{\epsilon}{2}$. All instances of $H$ would then be integrated over this uncertainty. The case when $z = 0$ would then be interpreted as an integral of the form
$$\int_{-\frac{\epsilon}{2}}^{\frac{\epsilon}{2}} H(x)P(x)\,dx$$
which is formally independent of the value of $H(0)$ (as long as $P$ stays finite) since the width of the undefined area is infinitely thin2.

2Expressed more mathematically: $H$ is defined 'almost everywhere'.
The conditions in if statements may also contain conjunctions && or disjunctions ||, so that (A && B) is true if and only if A and B are both true, and (A || B) is true if and only if A or B or both are true. A disjunction (A || B) is equivalent to A + B − AB and a conjunction (A && B) is equivalent to AB.
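These encodings can be verified exhaustively over the truth values {0, 1}:

```python
# The polynomial encodings of the logical connectives over {0, 1}:
# (A || B) = A + B - A*B and (A && B) = A*B.

for A in (0, 1):
    for B in (0, 1):
        assert (A or B) == A + B - A * B
        assert (A and B) == A * B
print("identities hold")
```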
Introduction of the Heaviside step function means that, technically, we can no longer represent programs as polynomial rationals. However, we consider the step function to be the limit of an infinite series of polynomial terms, and so we can still express a program as a polynomial rational, but with the implicit understanding that this is in the limit of infinite degree polynomials.
3.1.5 Conditional loops and Rice’s Theorem
Conditional loops can be implemented using the structures we have already described:

while(A) {
    ...
}

is equivalent to

loop M {
    if(A) {
        ...
    }
}

for some M that gives the maximum number of times the while loop can iterate over the domain of inputs.
It may be argued that in full generality one cannot put a finite number on M since some programs enter infinite loops and never terminate. To make matters worse it is a well known result of theoretical computer science (Turing, 1936) that one cannot always tell if a given conditional loop will end up being infinite or not. Rice’s theorem (Rice, 1953) uses this result to show that it is, in fact, impossible to construct an analysing machine that can, in full generality, tell us anything non-trivial about the properties of programs. Here, a trivial property
is defined as one that is true of either all or no computer programs. This seems very negative, but we must think carefully about what Rice's theorem is telling us: it is not telling us that for a given input program we cannot prove whether it has a given property; rather, it is telling us that for any given property, there exists at least one input program for which the analysis machine will either not terminate, or will end up showing something trivial about the input program.
In addition, by insisting that M is given a finite value we are, in effect, restricting ourselves to the subset of computer programs known as 'basic recursive'. These are the programs that can be proved to terminate. As it turns out (Solomonoff, 2005), almost all computer programs in practical use happen to compute basic recursive functions. Since a program that computes a basic recursive function can be proven to terminate, Rice's theorem does not apply. These programs can all be proven to have the kind of properties that we are likely to want to identify[3]. Upon reflection, it is not surprising that climate models in use can be shown to terminate: they are written that way. For example, if it were even suspected that an algorithm used in a GCM could take more than, say, a month of computer time to execute one timestep on some input, this would in all practical respects be considered a bad algorithm and would be rewritten or thrown out of the model[4]. We therefore restrict ourselves to the consideration of the basic recursive functions without fear that this will be a restriction for our proposed application.
3.1.6 Arrays
To complete the description of our analysis technique, we present a method of dealing with arrays. This is conveniently done by representing the whole array as a single polynomial with the array’s index variable as its independent variable
[3] It is the author's opinion that this is no accident. Rather, it is a side-effect of the fact that programs are written by humans to do useful things.
[4] The one exception to this is the use of randomised algorithms, some of which may technically never terminate. However, these algorithms all have the property that the probability of termination very quickly approaches 1 as the number of iterations of some loop increases. So, by limiting the number of iterations we effectively take an algorithm with a vanishingly small probability of not terminating and replace it with an algorithm with a vanishingly small probability of returning the wrong answer. So when we come to integrate over the random numbers, there is always a finite M that ensures that the result is the correct answer with a vanishingly small error.
(multidimensional arrays can trivially be reduced to one-dimensional arrays by using the address in memory as the index of each element). Suppose we have an array, A, of size N. We choose the N equidistant points on the interval [−1 : 1]

x_n = 2n/(N − 1) − 1

where n is an integer in the range 0 ≤ n < N. From this, we let the Lagrange basis polynomials be defined as
l_n(x) = ∏_{0 ≤ i < N, i ≠ n} (x − x_i)/(x_n − x_i).

These polynomials have the important property that l_n(x_m) = 0 if n ≠ m and l_n(x_n) = 1. If we let a_i be the value of A[i] for all integers 0 ≤ i < N, then the (N − 1)th degree polynomial

A(x) = Σ_{i=0}^{N−1} a_i l_i(x)

has the property that, for any integer 0 ≤ j < N, A(2j/(N − 1) − 1) = a_j. So an array reference A[j] is equivalent to the polynomial

A(2j/(N − 1) − 1).

If we now define the bi-variate polynomial L(i, x) as

L(i, x) = Σ_{j=0}^{N−1} l_j(2i/(N − 1) − 1) l_j(x)

so that L(i, x) = l_i(x) for any integer 0 ≤ i < N, then the value of an array after an assignment operation A[i] = Y is equivalent to the polynomial

A(x) + (Y − A(2i/(N − 1) − 1)) L(i, x).

3.1.7 Related work

The approach described in this chapter can most generally be described as belonging to the computer science discipline known as 'static analysis', which covers any analysis that treats a computer program as a mathematical object to be analysed rather than as a code to be executed. Examples of this date back as far as Moore's interval arithmetic (Moore, 1966), although a formal treatment of the idea wasn't given until 1977 (Cousot and Cousot, 1977), when it was named 'abstract interpretation', since the variables of the program, rather than being considered as numbers, are considered to be some more abstract mathematical object that represents, in some way, our knowledge of the variable's true or possible values. The main contribution of Cousot and Cousot was the formal analysis of loops in terms of lattice theory, which showed a way of ensuring that all programs can be analysed in finite time (although, as a corollary to Rice's theorem, this may result in a trivial analysis). Static analysis has now developed into a mature field (Cousot, 1996; Hinchey et al., 2008) and its techniques are used in a range of applications including optimising compilers (e.g. Aho et al., 2007) and the validation of safety-critical computer systems such as those used to control passenger aircraft or nuclear power plants (e.g.
Blanchet et al., 2002). Closest to the approach presented here is the sub-discipline of 'symbolic analysis' (e.g. Fahringer and Scholz, 2003), which uses symbolic expressions to represent the values of variables. This has been applied to parallelising compilers (Haghighat and Polychronopoulos, 1996; Kyriakopoulos and Psarris, 2009), compiler optimisation (Van Engelen, 2001), validation of cryptographic protocols (Bracciali et al., 2008; Modersheim and Vigano, 2009; Canetti and Herzog, 2010), detection of run-time errors (Bush et al., 2000; Cadar et al., 2006) and execution time analysis (Blieberger, 2002), to give a few examples. The idea of representing the value of variables as functions, rather than numbers, was also proposed by Epstein et al. (1982a, 1982b), who named it "ultra arithmetic". This is of particular relevance to our project, as this approach was used to prove that the output of programs was approximated by some polynomial. Recently, Brisebarre and Joldes (2010) have extended this technique to perform a simple, mono-variate symbolic analysis of various simple functions by Chebyshev polynomial approximation, although there was no treatment of division or reciprocation of polynomials, so this does not constitute a full arithmetic. Trefethen, Battles and Platte have also developed a library of functions called "Chebfun" that allows symbolic analysis with Chebyshev polynomials (Battles and Trefethen, 2004; Trefethen, 2007; Platte and Trefethen, 2010). However, this library only allows mono-variate polynomials, does not calculate error bounds and does not implement any syntactic analysis of programs. The work presented in this and the remaining chapters develops the approach of these authors much further. The treatment of random number generators and arrays presented here is, to the author's knowledge, new. In the remaining chapters, techniques will be described that allow programs of much greater complexity to be analysed.
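To make the array encoding of Section 3.1.6 concrete, here is a small numerical sketch (in Python; the names `encode`, `ref` and `assign` are illustrative, not part of iGen). The interpolating polynomial through the points x_n = 2n/(N − 1) − 1 reproduces each array element, and the assignment formula updates a single element while leaving the others untouched:

```python
from functools import reduce

def lagrange_basis(n, N):
    """l_n(x): Lagrange basis polynomial on the N equidistant
    points x_i = 2i/(N-1) - 1 on the interval [-1, 1]."""
    xs = [2 * i / (N - 1) - 1 for i in range(N)]
    def l(x):
        return reduce(lambda p, i: p * (x - xs[i]) / (xs[n] - xs[i]),
                      (i for i in range(N) if i != n), 1.0)
    return l

def encode(a):
    """A(x) = sum_i a_i l_i(x): the whole array as one polynomial."""
    N = len(a)
    bases = [lagrange_basis(i, N) for i in range(N)]
    return lambda x: sum(ai * li(x) for ai, li in zip(a, bases))

def ref(A, j, N):
    """Array reference A[j] as the polynomial value A(2j/(N-1) - 1)."""
    return A(2 * j / (N - 1) - 1)

def assign(A, i, Y, N):
    """A[i] = Y as the polynomial A(x) + (Y - A(x_i)) l_i(x),
    using L(i, x) = l_i(x) for integer i."""
    li = lagrange_basis(i, N)
    xi = 2 * i / (N - 1) - 1
    return lambda x: A(x) + (Y - A(xi)) * li(x)

a = [5.0, -2.0, 7.0, 1.0]
A = encode(a)
vals = [round(ref(A, j, 4), 9) for j in range(4)]   # reads back a
A2 = assign(A, 2, 9.0, 4)                           # A[2] = 9
```

Reading the encoded array back gives the original elements, and after the assignment `ref(A2, 2, 4)` gives the new value while `ref(A2, 0, 4)` is unchanged.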
3.2 A formal, symbolic semantics of computer programs

We now present a formal development of the ideas discussed in this chapter. We begin by describing how a computer program can be converted into a rational of the form Q/R, where Q and R are polynomials. The first stage in our analysis is to consider a high resolution computer program as a mathematical function. In order to turn a computer program into a mathematical function we follow the method of 'denotational semantics' (see, e.g. Mosses, 1990). However, to understand how this can be done we need first to understand syntactic analysis.

3.2.1 Formal syntactic analysis

When syntactically analysing a program, we begin by considering a computer program as a long list of characters (i.e. a string). While some strings are meaningful computer programs, others (for example "The Third Policeman", Flann O'Brien, 1993) are not. It is the 'syntax' of the computer language that tells us which strings of characters are computer programs in that language and which are not. It does this by relying on the observation that all languages are made up of a finite number of recurring structures. So although there are an infinite number of possible Fortran computer programs, for example, they are all made up of a finite number of simple, recognisable structures, i.e. loops, subroutines, 'if' statements etc. Similarly, the English language is made up of nouns, verb phrases, subordinate clauses, sentences etc. Every string of a language must be built up from these recurring structures. If a string contains a structure that is not in the syntax of a language, then it is not part of that language. Every programming language has a well-defined syntax which can be expressed as a set of rules called 'rewrite rules'. Each rewrite rule describes how a given type of structure can be made up of characters and sub-structures. These rules can conveniently be written in a form called Backus Naur Form (BNF) (Backus et al., 1963).
As a simple example, let's describe the syntax of arithmetic equations on integers like 1 + 1 = 2. Let's begin by describing the syntax of integers. This can be expressed in BNF as

Rule 1: <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Rule 2: <integer> ::= <digit>
Rule 3: <integer> ::= <integer> <digit>

Here, each name in angle brackets denotes a type of structure, ::= means 'may be rewritten as' and | separates alternatives. Let's now introduce the arithmetic operators +, −, ∗ and /:

Rule 4: <expression> ::= <integer>
Rule 5: <operator> ::= + | − | ∗ | /
Rule 6: <expression> ::= <expression> <operator> <expression>

Adding these rules to the first three, we can now see how the rules can be used to construct arithmetic expressions: let's say we begin with an <expression>; Rule 6 rewrites this to <expression> <operator> <expression>, and repeated application of Rules 1 to 5 can then rewrite each part until we reach, for example, the string 1 + 1. To complete the syntax we add the equate operator:

Rule 7: <equation> ::= <expression> = <expression>

Using this rule we can derive, for example, 1 + 1 = 2. So these 7 rules can describe all possible arithmetic equations on integers, although for completeness we might want to add a rule for parentheses:

Rule 8: <expression> ::= ( <expression> )

This syntax can distinguish between strings that are arithmetic equations on integers and those that are not: if there exists a way of rewriting an <equation> into a given string using the rules, then the string is an arithmetic equation on integers; otherwise it is not.

3.2.2 Formal semantics

The meaning of a string can be derived by adding some extra information to the rewrite rules of the syntax. We introduce the notation m_T to denote the meaning of syntactic token <T>. As an example, take a syntax of binary numbers and annotate each rule with a semantic rule:

Rule 1: <binarydigit> ::= 0 | 1,  with m_binarydigit = 0 | 1
Rule 2: <binarynum> ::= <binarydigit>,  with m_binarynum = m_binarydigit
Rule 3: <binarynum> ::= <binarynum> <binarydigit>,  with m_binarynum = 2 ∗ m_binarynum + m_binarydigit

Here, the meaning of a token is defined in terms of the meanings of the tokens from which it is built. There is one, perhaps subtle, but very important point to note about Rule 1. In this rule the 0|1 in the syntax part refers to the characters '0' and '1' (that is, ASCII codes 48 and 49 if the string is stored in ASCII encoding) while in the semantic part the 0|1 refers to the numbers 0 and 1. So in effect the rule is saying that ASCII code 48 (the character 0) means the number 0 and ASCII code 49 (the character 1) means the number 1.

3.2.3 A symbolic semantics of computer programs

We now have all we need to define the semantics of a computer program. Our aim is to define the semantics so that the meaning of a program is a vector of polynomial rationals. The syntax and semantics of a modern, high-level computer programming language is very large (see, for example, ISO/IEC, 1998).
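As a toy sketch of how semantic rules attached to a grammar evaluate a string (in Python; the function names are ours), the meaning of a binary numeral can be computed from the meanings of its parts, following m_binarynum = 2 ∗ m_binarynum + m_binarydigit:

```python
def meaning_binarydigit(ch):
    # Rule 1: the character '0' (ASCII 48) means the number 0,
    # and '1' (ASCII 49) means the number 1
    assert ch in "01"
    return ord(ch) - 48

def meaning_binarynum(s):
    # Rule 2: a single digit is its own meaning
    if len(s) == 1:
        return meaning_binarydigit(s)
    # Rule 3: m_binarynum = 2 * m_binarynum + m_binarydigit
    return 2 * meaning_binarynum(s[:-1]) + meaning_binarydigit(s[-1])
```

So the string "101" is assigned the number 5, built up entirely from the meanings of its sub-structures.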
However, it is common practice for compilers to be split into a 'front end' and a 'back end'. The front end translates a computer program written in a high-level language (such as Fortran or C++) into an intermediate-level language which has a much simpler syntax. This intermediate code is then passed to the back end to be turned into an executable for the target machine (Aho et al., 2007). GIMPLE (Merrill, 2003) is an example of such an intermediate language. So, rather than give a semantics of a high-level computer programming language, we give the semantics of a typical intermediate language. Any computer program that analyses this intermediate language can be attached to the front end of a compiler to make it capable of analysing a high-level computer language. In this way, a semantics of an intermediate-level language implies a semantics for any high-level language. For simplicity, we omit the semantics of subroutines as these are taken to have the standard semantics.

The meaning of a computer program is taken to be a vector of polynomial rationals in the input variables. Each element of the vector represents the value of some variable, so we can think of this as a function, Y = F(X), from a vector of input variable values, X, to a vector of output variable values, Y. In what follows:

(F)^n is the nth power of F
F(X)_n is the nth element of F(X)
I is the identity matrix
H(X) is the Heaviside step function
~1 is the vector whose elements are all 1
m^T is the transpose of m
N is the maximum number of times a loop is repeated in the program of interest; since we are restricting ourselves to the basic recursive functions, this is always provably finite.

Our semantics of computer programs can be expressed in the following way (we omit the semantics of arithmetic expressions, as this is dealt with in the following chapters):

1. m_program = m_codeblock

2. m_codeblock = m_statement(m_codeblock)

3. m_codeblock = m_statement

4. m_codeblock = I
5. m_statement = (m_codeblock)^{m_integer} [5]

6.

7.

8. m_statement = m_boolean ∗ m_codeblock1 + (1 − m_boolean) ∗ m_codeblock2

9. m_statement(X) = X + (m_expression − (m_arrayvar · X)(2(m_variable · X)/(D − 1) − 1)) L(2(m_variable · X)/(D − 1) − 1) m_arrayvar

where D is the dimension of the array,

L(i, x) = Σ_{j=0}^{D−1} l_j(i) l_j(x)

and l_j(x) is the jth (D − 1)th degree Lagrange basis polynomial on the equidistant points on the interval [−1 : 1]

10.

11. m_boolean = H(m_expression1 − m_expression2)

12. m_boolean = H(m_expression2 − m_expression1)

13. m_boolean = m_boolean1 m_boolean2

14. m_boolean = m_boolean1 + m_boolean2 − m_boolean1 m_boolean2

15. m_boolean = (1 − m_boolean).

[5] Note that this treatment of integers does not account for overflow, so the output of code that depends on the overflow of integers may compute different values than the analysis. Other than in the creation of random numbers, treated separately, this is not considered to be a problem for physical simulation code.

3.3 Random numbers

We now show how a random number generator with a top hat probability distribution can be used to generate random numbers with any arbitrary distribution.

Theorem 3.3.1. Given a random number, r, generated with a probability distribution given by a top hat probability distribution function

P(R = r) = 1/2 if −1 ≤ r ≤ 1, and 0 otherwise,

then for any arbitrary distribution P(X = x)

P(X = x) dx = P(R = r) dr

where

x = C⁻¹(X = (r + 1)/2)

and C⁻¹(X = x) is the inverse of the cumulative probability function of P(X = x).

Proof. Let R be a random variable so that P(R = y) has a top hat probability distribution as above. Let f(y) be any monotonically increasing function and let X be the image of R under the transformation f. It is a standard result that

P(X = f(y)) df(y) = P(R = y) dy    (3.1)

Integrating both sides over −∞ ≤ y ≤ r gives

∫_{f(−∞)}^{f(r)} P(X = y) dy = (r + 1)/2

for −1 ≤ r ≤ 1, but the left-hand side of this equation is just the definition of the cumulative probability function of P(X = y), call it C(X = y).
So

C(X = f(r)) = (r + 1)/2

and, if we let C⁻¹(X = x) be the inverse of the resulting cumulative probability distribution, then

f(r) = C⁻¹(X = (r + 1)/2).

Substituting this into equation 3.1 gives the required result.

Although there does not always exist an analytic function for the inverse of the cumulative probability, C⁻¹(y), for a given P(y), this can always be approximated to any arbitrary precision by using, for example, a polynomial fit or some other analytic approximation.

3.4 Conclusion

In this chapter we have seen how a computer program can be converted into a vector of functions of the form Q/R, where Q and R are polynomials. Simple examples were given to illustrate how a computer program can be converted into such a vector, approximated and turned back into a computer program that approximates the original.

However, this chapter is meant to provide a theoretical grounding; there remain many practical considerations to be addressed before this method can be applied to computer programs of more realistic complexity. In the next two chapters we introduce the techniques necessary to develop this into a working implementation, and describe a number of numerical experiments that we have performed to test these techniques.

Chapter 4

Experiments with Chebyshev Polynomials

4.1 Introduction

A number of numerical experiments were performed to test the method of approximating programs described in the previous chapter. To do this, a program was written in C++ which analyses programs written in C++ and approximates them. We called this program iGen. The main obstacle that needed to be addressed in a practical implementation of this method was the exponential explosion of the degree of the polynomial rationals.
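A toy illustration of this blow-up (in Python; not iGen itself): each multiplication of a polynomial by itself doubles its degree, so an iterated update such as x ← x² reaches degree 2^k after only k steps:

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists
    (lowest degree first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

p = [0.0, 1.0]       # the polynomial x
degrees = []
for _ in range(5):   # five iterations of x <- x * x
    p = poly_mul(p, p)
    degrees.append(len(p) - 1)
# degrees == [2, 4, 8, 16, 32]
```

After five iterations the degree is already 32; a model iterated over tens of timesteps, with several multiplications per step, quickly becomes impossible to store explicitly.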
For simple programs such as the examples given in the previous chapter (and for many not-so-simple programs) it is possible to explicitly calculate the polynomial rational for each of the program's output variables; modern computers are powerful enough to easily manipulate polynomials with thousands of terms. However, for many real-world models the equivalent polynomials are of much too high an order to be stored or manipulated explicitly. A model of the atmosphere that integrates over only a few tens of timesteps, for example, would be equivalent to a vector of polynomials exceeding many millions of terms.

To deal with this, we began by representing variables as polynomials in the Chebyshev basis. The Chebyshev basis is a sequence of polynomials defined, for all integers n ≥ 0, as

T_n(x) = cos(n cos⁻¹(x))

on the interval −1 ≤ x ≤ 1. This is known as the nth Chebyshev polynomial of the first kind. As a consequence of De Moivre's theorem, which can be stated as

cos(nx) + i sin(nx) = (cos(x) + i sin(x))^n

and of the identity

sin²(x) + cos²(x) = 1

it can be seen that T_n(x) can be expressed as an nth degree polynomial in x, the first few of which are:

T_0(x) = 1
T_1(x) = x
T_2(x) = 2x² − 1
T_3(x) = 4x³ − 3x
T_4(x) = 8x⁴ − 8x² + 1.

Any polynomial P(x) of degree m can be uniquely expressed as a sum of Chebyshev polynomials:

P(x) = Σ_{n=0}^{m} a_n T_n(x)

as can be seen by solving for degree m and working our way down to zero, so the set of coefficients a_n uniquely defines a polynomial P. The Chebyshev basis was chosen for this implementation because of its good approximation properties. It is a well-known result (see, e.g. Mason and Handscomb, 2003) that a close to optimal nth degree approximation of an (n + m)th degree polynomial can be found by simply setting the highest m coefficients of a Chebyshev series to zero. Although there exists a unique, optimal nth degree approximation and an algorithm to find it (Remez' algorithm, see, e.g.
Cheney, 2000), this algorithm uses an iterative search method and so is much more computationally expensive than the Chebyshev method. The extra computational time necessary to find the optimal approximation is generally considered to be rarely worthwhile (e.g. Press et al., 2007).

4.2 Approximate algebra with Chebyshev bounds

Our program implemented an approximate algebra on pairs (P, ε), where P is a multivariate Chebyshev polynomial and ε is a constant bound on the absolute error between P and the exact value of the variable it represents. So, (P, ε) represents the interval [P − ε : P + ε]. We call these pairs "Chebyshev bounds". When a Chebyshev bound, (P, ε), becomes too large to manipulate efficiently, P is replaced by a (smaller) approximation and a constant bound on the error introduced by the approximation is added to ε. The approximation is found by ordering the terms of P by the absolute value of their coefficients. The term with the smallest coefficient is removed from P. Since each Chebyshev polynomial basis function is bounded by ±1, the magnitude of the error introduced by removing this term is bounded by the magnitude of its coefficient, so this is added to ε. Then the next smallest term is removed and the absolute value of its coefficient added to ε. This process is repeated until the remaining polynomial reaches a certain size or a limit on the acceptable error is reached. This process can be summarised as

(A + e, ε) = (A, ε + B(e))

where B(e) is a function that returns the sum of the absolute values of the Chebyshev coefficients of e. Closer bounds than B(e) could have been found by first converting e to a Bernstein polynomial. Rababah (2003) shows how the conversion can be done and Cargo and Shisha (1966) show how Bernstein polynomials can be used to obtain bounds.
This method was implemented, but it was found that the increase in the accuracy of the bounds was outweighed by the computational effort of calculating them; better accuracy was achieved when the computational time was used to retain more terms of A instead.

4.2.1 Addition, subtraction and multiplication

Addition, subtraction and multiplication of Chebyshev bounds are straightforward and implemented using the rules

(P, ε₁) + (Q, ε₂) = (P + Q, ε₁ + ε₂)
(P, ε₁) − (Q, ε₂) = (P − Q, ε₁ + ε₂)
(P, ε₁) × (Q, ε₂) = (P × Q, B(P)ε₂ + B(Q)ε₁ + ε₁ε₂)

where B(P) is a bound on the absolute value of the polynomial P, given by the sum of the absolute values of its Chebyshev coefficients.

4.2.2 Division

Division of Chebyshev bounds is somewhat more complicated because of the need to maintain formal bounds. We would like to solve

P = (Q ± ε_q)/(R ± ε_r) = Q/R ± ε_p

for ε_p. A little manipulation gives

P = Q/R ± (B(1/R) ε_q + B(P/R) ε_r).

However, this involves calculating B(1/R) and B(P/R). This could be done explicitly by performing the polynomial divisions, but this would be computationally expensive. Alternatively, the inequality

B(P/R) ≤ B(P)/B_l(R)

could be used, where B_l(R) is a lower bound on R. Calculating B_l(R) accurately, however, would also be computationally expensive (Cargo and Shisha, 1966; Lin and Rokne, 1995; Stahl, 1995; Cornelius and Lohner, 1984).

Reciprocation of a Chebyshev bound, on the other hand, has a neater form. We would like to solve

P = 1/(R ± ε_r) = 1/R ± ε_p

which gives

P = 1/R ± ε_r B(P²).

B(P²) can be calculated without having to square P by using the inequality B(P²) ≤ B(P)². So, division of Chebyshev bounds was performed by multiplication of the numerator with the reciprocal of the denominator. From the arguments above, reciprocation of Chebyshev bounds follows the rule

1/(R, ε) = (1/R, ε B(1/R)²).

It was found that, during the analysis of realistic programs, division was much less common than multiplication, addition and subtraction of Chebyshev bounds.
For this reason it was deemed more computationally efficient to approximate reciprocals of polynomials as polynomials, so that subsequent operations would be on polynomials rather than polynomial rationals. For ease of implementation, reciprocation of a polynomial, R, was performed by first scaling the range of R to lie in the interval [−1 : 1], then using the recursion

P_{n+1} = P_n(2 − P_n R)

which converges on P = 1/R in the region bounded by P = 0 and P = 2/R. Numerical experiments showed that the best first-order value for P_0 that optimises the rate of convergence is P_0 = 2.9938 − 2.172R if the range of R can be shown to be contained by the interval [0 : 1], P_0 = −2.9938 − 2.172R if the range of R can be shown to be contained by the interval [−1 : 0], and P_0 = 1.8045R otherwise. Loose bounds on the range of R were calculated as

[r_0 − B(R − r_0) : r_0 + B(R − r_0)]

where r_0 is the zeroth-degree coefficient of R. Here, it doesn't matter that the bounds are loose, as this doesn't affect the result, just the rate of convergence to the result. One advantage of this algorithm is that the residual is known at each iteration, since it is given by 1 − P_n R. Iteration stops when the residual can be bounded below the desired accuracy.

A more computationally efficient algorithm could be written by finding the polynomial P that satisfies PR = 1 for all coefficients up to the degree of P. In the Chebyshev basis, if the degree of P is known, this just involves solving a set of simultaneous equations. However, we do not know a priori the degree of P necessary to obtain a given accuracy, so some guesswork would be involved. Alternatively, R could be transformed to a power-series polynomial. In this basis PR = 1 can be solved in linear time by solving for the lowest orders first and working upwards to higher orders. The algorithm could continue in this fashion until the desired accuracy is achieved.
However, in this basis, much higher degree terms would need to be calculated to achieve the same bound on accuracy as in the Chebyshev basis.

4.3 Transcendental functions

Strictly speaking, the transcendental functions (log, exp, cos etc.) cannot be expressed as polynomials. However, all computer implementations of these functions use algorithms that reduce to polynomials (e.g. Kropa, 1978). These could be analysed symbolically using the methods described above. However, by implementing versions designed especially for symbolic evaluation, the analysis can be made much more efficient.

4.3.1 Non-integer powers of polynomials

Non-integer powers of Chebyshev bounds were implemented by noting that any power x^n can be expressed in the form

x^n = B_u(x)^n (x/B_u(x))^n  if n < 1
x^n = B_u(x)^n ((x/B_u(x))^q)^N  otherwise

where N is the lowest integer not smaller than n/2, q = n/N (so that 1 ≤ q < 2) and B_u(x) is an upper bound on x. Since raising polynomials to an integer power N is easily implemented in O(ln(N)) multiplications, this reduces the problem to that of finding R^q for a polynomial whose range is in the interval [0 : 1]. This can be easily implemented as a Chebyshev polynomial P(x, q) which approximates x^q over the necessary ranges, giving R^q = P(R(x), q) ± ε_p, where ε_p is a bound on the error due to the approximation in P. Composition of Chebyshev polynomials was reduced to a sequence of additions and multiplications using Clenshaw's recurrence (Clenshaw, 1962; Press et al., 2007).

4.3.2 Exponentiation of polynomials

The function e^P on a polynomial P was implemented by noting that

e^P = e^{(P′ + c_0)} = e^{c_0} (e^{P′/M})^M

where P′ = P − c_0, M is the lowest integer that bounds P′ above and below, and c_0 is the zeroth-degree coefficient of P. It is clear that −1 ≤ P′/M ≤ 1, so e^{P′/M} can be approximated by composition with a Chebyshev polynomial that approximates exponentiation over this range.
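A scalar sketch of this decomposition (in Python; a truncated Taylor series stands in for the Chebyshev approximation of e^x on [−1, 1], and the splitting constant `c0` plays the role of the zeroth-degree coefficient):

```python
import math

def exp_unit(x, terms=12):
    # Stand-in polynomial approximation of e^x on [-1, 1].
    # (The thesis composes with a Chebyshev approximation here;
    # a truncated Taylor series is used for simplicity.)
    s, t = 0.0, 1.0
    for k in range(terms):
        s += t
        t *= x / (k + 1)
    return s

def exp_poly(p, c0):
    """e^p via e^p = e^{c0} * (e^{p'/M})^M with p' = p - c0 and
    M the smallest positive integer with |p'| <= M, so p'/M is
    in [-1, 1] where the polynomial approximation is valid."""
    pp = p - c0
    M = max(1, math.ceil(abs(pp)))
    return math.exp(c0) * exp_unit(pp / M) ** M

err = abs(exp_poly(3.7, 1.0) - math.exp(3.7))
```

Even though 3.7 − 1.0 = 2.7 lies well outside [−1, 1], the rescaled argument 2.7/3 = 0.9 does not, so the restricted approximation suffices and `err` is tiny.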
4.3.3 Sine and cosine

Sine and cosine functions can be implemented with help from the trigonometric definition of the Chebyshev polynomials: