Programming with Numerical Uncertainties

THÈSE NO 6343 (2014)

PRÉSENTÉE LE 28 NOVEMBRE 2014
À LA FACULTÉ INFORMATIQUE ET COMMUNICATIONS
LABORATOIRE D'ANALYSE ET DE RAISONNEMENT AUTOMATISÉS
PROGRAMME DOCTORAL EN INFORMATIQUE ET COMMUNICATIONS

ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE

POUR L'OBTENTION DU GRADE DE DOCTEUR ÈS SCIENCES

PAR

Eva DARULOVÁ

acceptée sur proposition du jury:

Prof. M. Odersky, président du jury
Prof. V. Kuncak, directeur de thèse
Prof. R. Bodik, rapporteur
Prof. C. Koch, rapporteur
Dr R. Majumdar, rapporteur

Suisse 2014

Dědečkovi

Acknowledgements

I would like to thank Viktor Kuncak for his guidance during my five years at EPFL. His wealth of knowledge, enthusiasm and perseverance are amazing and have been very motivating. I have much appreciated the independence he gave me in pursuing my own research topic, while relentlessly pushing me in areas that do not come to me quite as easily. His attention to detail, while making paper writing a roller-coaster experience, was highly instructive and made the most of our work. I would further like to thank my thesis committee members Rupak Majumdar, Rastislav Bodik, Christoph Koch and Martin Odersky for their time and many valuable comments.

There are many people without whom the last five years would have been significantly less fun. I thank all past and present LARA lab members who contributed in one way or another to making my stay at EPFL a pleasant experience: Ruzica, Philippe, Hossein, Giuliano, Tihomir, Etienne, Swen, Andrej, Filip, Pierre-Emmanuel, Ivan, Regis, Mikael, Ravi and Manos. Philippe and Etienne in particular deserve a medal for helping me deal with the intricacies of version control and misbehaving installs in general. Fabien Salvi showed equally great patience when I managed to break my computer(s) in yet another creative way. Danielle Chamberlain and Yvette Gallay were very helpful when it came to paperwork and obscure things like sending a fax. The LAMP group and especially Manohar, Nada, Vojin, Alex, Hubert and Eugene deserve a big thank you for being great discussion and coffee drinking buddies as well as world-class ‘tech-support’ when it came to all things Scala. Furthermore, Petr and Cristina made sure I had great company during lunch times and beyond, Djano provided for many entertaining bilingual coffee breaks and Stefan was a great climbing, running and moving partner.

I have also had great support from afar. I thank Patty for taking the time and being such an amazing mentor. Jenny and Felix were with me from the start and were always my link to, at times much needed, non-academic reality, even though I moved from one far-away place to another. Barry kept up the spirits with endless questions, and helped me remember the Irish way of saying ‘Ah sure, it’ll be grand!’. Sure enough, in the end it always was.

It is safe to say that without my parents’ support this thesis would not have been possible. They showed me that working hard pays off, how to enjoy the small as well as the big things in life and let me try anything - whether it meant burnt pots or moving to a foreign country. And Jana made sure I was not feeling alone in Switzerland and was always there when I needed ice cream. Finally, I thank Sebastian for his unconditional support.

Lausanne, 12 March 2011

E. D.


Preface

Can computers compute? We all learn the notion of arithmetic computation in school and soon start to associate it with real numbers and operations on them. Unfortunately, our platforms, including hardware, languages and environments, have no reliable support for computing with arbitrary real numbers. Numerous software packages perform impressive counts of approximate arithmetic operations per second, allowing us to understand complex phenomena in science and engineering. Moreover, embedded controllers are in charge of our cars, trains, and airplanes. Yet this essential infrastructure offers few guarantees that the computations follow the real-numbered mathematical models by which they were inspired.

The floating-point arithmetic standard is helpful in ensuring that individual operations are performed according to precision requirements. These are, however, guarantees at the machine instruction level. The main challenge remains to estimate how roundoff errors of individual instructions compose into the error on the overall algorithm. Programming languages (since FORTRAN) have made great progress in abstracting exact machine instructions into larger computation chunks that we can reason about at the source code level. Unfortunately, approximate computations do not abstract and compose as easily. It is tempting to combine errors of individual operations independently, as in interval arithmetic, but this proves to be too pessimistic. Our ability to perform so many individual operations quickly bites us back, resulting in correct, but useless, estimates. Affine arithmetic computation turns out to be an improvement, and this thesis shows its first use that is compatible with floating points. Unfortunately, it is not only that the errors accumulate, but also that they are highly dependent on the ranges of values. Any precise analysis of error propagation needs to include the analysis of ranges of values.

This thesis presents analysis techniques that estimate the errors and sensitivity to errors in computations. The techniques involve much deeper reasoning about real number computations than is performed today. Moreover, it proposes an ideal programming model that is much more friendly to users: programs can be written in terms of real numbers along with specifications of the desired error tolerance. The question of error bounds becomes the question of correct compilation, instead of a manual verification issue for users. Such a model also opens up optimization opportunities that rely on reasoning with real numbers. This includes fundamental properties such as associativity of addition, which does not hold for floating points, yet is needed for, e.g., parallelization of summations. The thesis shows that it is possible to automatically find equivalent algebraic expressions that improve the precision compared to the originally given expression.


Whether the implementation is derived automatically or manually, the fundamental problem of computing the worst-case error bound on the entire program remains. When we perform forward computation from initial values, we need to maintain the information about our result as precisely as possible, taking into account known algebraic properties of operations and any invariants that govern the computation process. To compute bounds automatically, it is natural to examine decision procedures for non-linear arithmetic, whose implementations within solvers such as Z3 have become increasingly usable. It is tempting to consider the entire problem of approximate computation as asking a theorem prover a series of questions. However, the automated provers use algorithms with super-exponential running times, so we must delimit the size of queries we ask, and must be prepared for a failure as an answer. In other words, these provers are only a subroutine in techniques tailored towards worst-case error estimation.

For self-stabilizing computations that find, e.g., zeros of a function, we can avoid tracking dependencies across loop iterations. In such situations, we can estimate how good our solution is if we know the bounds on the derivatives of the functions we are computing with. The notion of derivatives of functions turns out to be as fundamental in automatically estimating errors as it is in real analysis. Together with bounding the ranges of values and performing non-linear constraint solving, derivatives become one of the main tools used for automated error computation presented in this thesis. Derivatives are crucial, for example, to compute propagation of errors in a modular fashion, or to find closed forms of errors after any number of loop iterations.

Programs handled by the techniques in this thesis need not denote continuous functions. Indeed, the widespread use of conditionals makes it very easy for software to contain discontinuities. Techniques introduced in the thesis can show the desired error bounds even for such programs with conditionals, where a real-valued computation follows one branch, whereas the approximate computation follows a different branch.

Real numbers, as their set-theoretic definitions suggest, are inherently more involved than integers. Approximation of real number operations seems necessary for computability. With emerging hardware models, approximation also presents an opportunity for more energy-efficient solutions. To reason about and compile approximate computations we need new analysis techniques of the sort presented in this thesis: techniques that estimate variable ranges by solving possibly non-linear constraints and accounting for variable dependencies, that leverage the self-stabilizing nature of algorithms, and that use derivatives of algebraic expressions while soundly supporting conditionals. This thesis presents a comprehensive set of techniques that address these problems, from the point of view of both verification and synthesis of code. The resulting implementations are all publicly available in source code. Thanks to these techniques, thinking in terms of real numbers is becoming closer to reality, even for software developers.

Lausanne, 15 October 2014

Viktor Kunčak

Abstract

Numerical software, common in scientific computing or embedded systems, inevitably uses an approximation of the real arithmetic in which most algorithms are designed. In many domains, roundoff errors are not the only source of inaccuracy, and measurement as well as truncation errors further increase the uncertainty of the computed results. Adequate tools are needed to help users select suitable approximations (data types and algorithms) which satisfy their accuracy requirements, especially for safety-critical applications.

Determining that a computation produces accurate results is challenging. Roundoff errors and error propagation depend on the ranges of variables in complex and non-obvious ways; even determining these ranges accurately for nonlinear programs poses a significant challenge. In numerical loops, roundoff errors grow, in general, unboundedly. Finally, due to numerical errors, the control flow in the finite-precision implementation may diverge from the ideal real-valued one by taking a different branch and produce a result that is far off from the expected one.

In this thesis, we present techniques and tools for automated and sound analysis, verification and synthesis of numerical programs. We focus on numerical errors due to roundoff from floating-point and fixed-point arithmetic, external input uncertainties or truncation errors. Our work uses interval or affine arithmetic together with Satisfiability Modulo Theories (SMT) technology as well as analytical properties of the underlying mathematical problems. This combination of techniques enables us to compute sound and yet accurate error bounds for nonlinear computations, determine closed-form symbolic invariants for unbounded loops and quantify the effects of discontinuities on numerical errors. We can furthermore certify the results of self-correcting iterative algorithms.

Accuracy usually comes at the expense of resource efficiency: more precise data types need more time, space and energy. We propose a programming model where the scientist writes his or her numerical program in a real-valued specification language with explicit error annotations. It is then the task of our verifying compiler to select a suitable floating-point or fixed-point data type which guarantees the needed accuracy. Sometimes accuracy can be gained by simply re-arranging the non-associative finite-precision computation. We present a scalable technique that searches for a more accurate evaluation order and show that the gains can be substantial.

We have implemented all our techniques and evaluated them on a number of benchmarks from scientific computing and embedded systems, with promising results.


Key words: floating-point arithmetic, fixed-point arithmetic, roundoff errors, numerical accuracy, static analysis, runtime verification, software synthesis

Zusammenfassung

Numerische Software, wie sie im wissenschaftlichen Rechnen oder eingebetteten Systemen verbreitet ist, nutzt zwangsläufig eine Approximation der reellen Zahlen, in denen die meisten Algorithmen entwickelt werden. In vielen Bereichen sind Rundungsfehler nicht die einzige Quelle von Ungenauigkeiten, und Mess- und Abschneidefehler erhöhen die Unsicherheit der berechneten Ergebnisse zusätzlich. Es braucht angemessene Tools, um Benutzern die Auswahl von geeigneten Approximationen (Datentypen und Algorithmen) zu erleichtern, die ihren Präzisionsanforderungen gerecht werden. Dies ist vor allem bei sicherheitskritischen Anwendungen wichtig.

Ob die Ergebnisse einer Rechnung akkurat sind, ist allerdings schwierig festzustellen. Rundungsfehler und Fehlerfortpflanzung hängen auf komplexe Art und Weise von den Intervallen der Variablen ab, und schon die genaue Bestimmung dieser Intervalle stellt bei nichtlinearen Programmen eine erhebliche Herausforderung dar. In numerischen Schleifen wachsen Rundungsfehler im Allgemeinen unbegrenzt. Hinzu kommt, dass durch numerische Fehler die Implementation in endlicher Arithmetik von der idealen reellen abweichen kann, indem sie einen anderen Weg durch die Kontrollstrukturen nimmt, und somit ein Ergebnis produziert, das weit vom Erwarteten entfernt ist.

In dieser Arbeit präsentieren wir Methoden und Tools für die automatisierte und korrekte Analyse, Verifikation und Synthese von numerischen Programmen. Unser Schwerpunkt liegt dabei auf numerischen Fehlern durch Rundungen von Gleitkomma- und Festkommaarithmetik, Unsicherheiten an externen Inputs oder Abschneidefehlern. Unsere Arbeit verwendet Intervall- oder affine Arithmetik zusammen mit SMT-Technologien (Satisfiability Modulo Theories) sowie die analytischen Eigenschaften der zugrunde liegenden mathematischen Probleme. Diese Kombination ermöglicht es uns, korrekte und dennoch präzise Fehlerschranken für nichtlineare Rechnungen und geschlossene symbolische Invarianten für unbegrenzte Schleifen zu finden und die Auswirkungen von Kontrollstrukturen auf numerische Fehler quantitativ zu bestimmen. Wir können darüber hinaus die Ergebnisse von selbst-korrigierenden iterativen Algorithmen zertifizieren.

Genauigkeit ist in der Regel nur auf Kosten der Ressourceneffizienz möglich: genauere Datentypen brauchen mehr Zeit, Platz und Energie. Wir stellen ein Programmiermodell vor, in dem der oder die Wissenschaftler/in numerische Programme in einer reellwertigen Spezifikationssprache mit expliziten Fehleranforderungen schreibt. Es ist dann die Aufgabe eines Verifizierungs-Compilers, einen geeigneten Gleitkomma- oder Festkommadatentyp zu wählen, der die erforderliche Genauigkeit gewährleistet.

Manchmal kann Präzision einfach durch eine andere Rechenreihenfolge gewonnen werden, da Computerarithmetik nicht assoziativ ist. Wir stellen eine skalierbare Methode vor, die nach einer optimalen Reihenfolge sucht, und zeigen, dass die Präzisionsgewinne erheblich sein können.

Wir haben alle unsere Methoden implementiert und sie an einer Reihe von Beispielen aus den Bereichen des wissenschaftlichen Rechnens und eingebetteter Systeme mit vielversprechenden Ergebnissen evaluiert.

Stichwörter: Gleitkommaarithmetik, Festkommaarithmetik, Rundungsfehler, numerische Präzision, statische Analyse, Laufzeitverifikation, Softwaresynthese

xii Résumé

Des logiciels numériques, communs dans le calcul scientifique ou dans les systèmes embarqués, utilisent inévitablement une approximation de l’arithmétique réelle dans laquelle la plupart des algorithmes ont été conçus. Dans de nombreux domaines, les erreurs d’arrondi ne sont pas la seule source d’imprécision et les erreurs de mesure et les erreurs de troncature augmentent encore l’incertitude des résultats calculés. Des outils adéquats sont nécessaires pour aider les utilisateurs à sélectionner des approximations convenables (types de données et algorithmes) qui répondent à leurs exigences en matière de précision, en particulier pour les systèmes critiques.

Déterminer si le résultat d’un calcul est précis est difficile. Les erreurs d’arrondi et la propagation d’erreurs dépendent des intervalles des variables de façon complexe et non triviale ; rien que la détermination précise de ces intervalles pour les programmes non linéaires représente un défi considérable. Dans les boucles numériques, les erreurs d’arrondi se développent, en général, sans limite. Enfin, à cause des erreurs numériques, le flux de contrôle d’une implémentation en précision finie peut s’écarter de l’implémentation réelle idéale en prenant une branche différente, et le résultat obtenu peut être très différent de celui attendu.

Dans cette thèse, nous présentons des techniques et des outils pour l’analyse, la vérification et la synthèse sûres et automatisées de programmes numériques. Nous mettons l’accent sur les erreurs numériques dues aux arrondis de l’arithmétique en virgule flottante et de l’arithmétique en virgule fixe, aux incertitudes externes d’entrée ou aux erreurs de troncature. Notre travail utilise l’arithmétique d’intervalles ou affine avec la technologie SMT (Satisfiability Modulo Theories) ainsi que des propriétés analytiques des problèmes mathématiques sous-jacents. Cette combinaison de techniques nous permet de calculer des limites d’erreur sûres et néanmoins précises pour les calculs non linéaires. Cela nous permet aussi de déterminer des invariants symboliques sous forme close pour les boucles non bornées et de quantifier les effets de discontinuités sur les erreurs numériques. Au-delà, nous pouvons certifier les résultats des algorithmes itératifs autocorrectifs.

La précision est généralement au détriment de l’efficacité des ressources : des types de données plus précis ont besoin de plus de temps, d’espace et d’énergie. Nous proposons un modèle de programmation où la ou le scientifique écrit le programme numérique dans un langage de spécification en valeurs réelles avec des annotations d’erreur explicites. C’est la tâche d’un compilateur vérificateur de sélectionner le type de données approprié, en virgule flottante ou en virgule fixe, qui garantit la précision nécessaire. De la précision peut, parfois, être gagnée simplement en réorganisant l’arithmétique non associative en précision finie.


Nous présentons une technique évolutive qui cherche un ordre d’évaluation optimal et nous montrons que les gains peuvent être substantiels.

Nous avons implémenté nos techniques et nous les avons évaluées sur un certain nombre d’exemples de calcul scientifique ainsi que de systèmes embarqués, avec des résultats prometteurs.

Mots clefs : arithmétique flottante, arithmétique en virgule fixe, erreurs d’arrondi, précision numérique, analyse statique, vérification durant exécution, synthèse de logiciels

Contents

Acknowledgements v

Preface vii

Abstract (English/Deutsch/Français) ix

List of figures xvii

List of tables xx

1 Introduction 1
1.1 Challenges in Analysis and Verification ...... 4
1.2 Synthesis ...... 6
1.3 Outline and Contributions ...... 7
1.4 Usage Scenarios ...... 8

2 Bounding Finite-Precision Roundoff Errors 11
2.1 Finite-Precision Arithmetic ...... 12
2.1.1 IEEE Floating-Point Arithmetic ...... 12
2.1.2 Higher-Precision Floating-Point Arithmetic ...... 13
2.1.3 Fixed-Point Arithmetic ...... 14
2.2 Interval Arithmetic as a Baseline ...... 14
2.3 Affine Arithmetic ...... 16
2.3.1 Nonlinear Computations ...... 17
2.4 Tracking Roundoff Errors with Affine Arithmetic ...... 20
2.4.1 Different Interpretations of Computations ...... 20
2.4.2 Tracking the Computation from a Single Input Value ...... 21
2.4.3 Tracking Ranges of Inputs ...... 24
2.4.4 Achieving Accuracy ...... 27
2.4.5 Correctness ...... 28
2.5 Comparison with Fluctuat ...... 29

3 Tracking Roundoff Errors at Runtime 31
3.1 Integration into Scala ...... 32
3.1.1 Conditionals ...... 34


3.2 Experimental Results ...... 34
3.2.1 Accuracy of AffineFloat ...... 34
3.2.2 Accuracy of SmartFloat ...... 36
3.2.3 Spring Simulation ...... 39
3.2.4 Performance ...... 41
3.2.5 Compacting ...... 41
3.2.6 Position of the Moon Case Study ...... 42

4 Synthesizing Accurate Fixed-Point Expressions 45
4.1 Exploiting Non-Associativity ...... 46
4.2 Search with Genetic Programming ...... 47
4.2.1 Genetic Programming ...... 48
4.2.2 Instantiating Genetic Programming ...... 48
4.2.3 Fitness Evaluation ...... 50
4.2.4 Why Genetic Programming? ...... 52
4.3 Experimental Results ...... 52
4.3.1 End-to-End Results ...... 53
4.3.2 Exhaustive Rewriting ...... 55
4.3.3 Genetic Programming ...... 56
4.4 Conclusion ...... 58

5 Writing and Compiling Real Programs 59
5.1 Programming in Reals ...... 59
5.1.1 A Real Specification Language ...... 60
5.2 Compiling Reals to Finite Precision ...... 61
5.2.1 Verification Conditions for Loop-Free Programs ...... 62
5.2.2 Specification Generation ...... 63
5.2.3 Difficulty of Simple Encoding into SMT solvers ...... 64
5.2.4 Compilation Algorithm ...... 64
5.2.5 Verification with Approximations ...... 66
5.3 Experience ...... 68
5.3.1 Polynomial Approximations of Sine ...... 68
5.3.2 Trade-off between Accuracy and Efficiency ...... 70

6 Computing Precise Range and Error Bounds 73
6.1 Range Bounds ...... 74
6.1.1 Range Computation ...... 74
6.1.2 Error Propagation ...... 76
6.1.3 Limitations ...... 78
6.1.4 Experimental Results ...... 78
6.2 Error Bounds ...... 80
6.2.1 Separation of Errors ...... 80
6.2.2 Computing Propagation Coefficients ...... 82

6.2.3 Relationship with Affine Arithmetic ...... 83
6.2.4 Higher Order Taylor Approximation ...... 84
6.2.5 Experimental Results ...... 86

7 Handling Control Structures 91
7.1 Discontinuity Errors ...... 91
7.1.1 Precise Constraint Solving ...... 92
7.1.2 Trading Accuracy for Scalability ...... 94
7.1.3 Experimental Results ...... 96
7.2 Loops ...... 100
7.2.1 Closed Form Expression ...... 101
7.2.2 Truncation Errors ...... 102
7.2.3 Experimental Results ...... 103

8 Certifying Results of Iterative Algorithms 111
8.1 Introduction ...... 111
8.2 Examples ...... 112
8.3 Certification Technique ...... 115
8.3.1 Unary Case ...... 115
8.3.2 Multivariate Case ...... 117
8.4 Implementation ...... 118
8.4.1 Scala Macros ...... 119
8.4.2 Computing Derivatives ...... 120
8.4.3 Uncertain Parameters ...... 120
8.5 Evaluation ...... 121
8.5.1 Performance ...... 121
8.5.2 Application to Optimization ...... 123

9 Related Work 127
9.1 Applicability and Portability ...... 127
9.2 Sound Finite-Precision Computations ...... 127
9.3 Quantifying Accuracy ...... 130
9.4 Synthesis ...... 133
9.5 Beyond Roundoff Errors ...... 134

10 Conclusion 137

Bibliography 139

Curriculum Vitae 149


List of Figures

1.1 Absolute errors committed for different triangle areas ...... 3
1.2 Absolute round-off errors of a simple gravity simulation ...... 5

2.1 Linear approximations of the inverse function ...... 19
2.2 Modified min-range approximation ...... 22
2.3 Division operation for tracking one operation ...... 23

3.1 Computing the area of a triangle with SmartFloats ...... 32
3.2 Kahan’s triangle area computation ...... 38
3.3 Quadratic formula in two versions ...... 38
3.4 Simulation of a spring with Euler’s method ...... 40
3.5 Effect of compacting ...... 42

4.1 Best expression search procedure ...... 49
4.2 Comparison of analyzed and simulated errors ...... 51
4.3 Evolution of errors across generations for the traincar 4 - state 1 example ...... 57

5.1 Example function written in our specification language ...... 61
5.2 Compilation algorithm ...... 65
5.3 Approximation pipeline ...... 66
5.4 Different polynomial approximations of sine ...... 69
5.5 Benchmark functions from physics and control systems ...... 70
5.6 Running times vs accuracy ...... 71

6.1 Algorithm for computing the range of an expression ...... 75
6.2 Illustration of the separation of errors ...... 81

7.1 Piece-wise approximation of the jet engine controller ...... 92
7.2 Computation of errors due to diverging paths ...... 93
7.3 Illustration of error computation for conditional branches ...... 95
7.4 Representative benchmarks ...... 97
7.5 Running average computation ...... 104
7.6 Pendulum simulation with sine approximation ...... 106
7.7 Planet orbiting the Sun simulation ...... 108


8.1 A double pendulum standing close to an obstacle ...... 113
8.2 Algorithm for computing errors in the unary case ...... 116
8.3 Algorithm for computing errors in the multivariate case ...... 118

List of Tables

3.1 Comparison of computed errors with Interval/Affine/SmartFloat ...... 36
3.2 Comparison of absolute errors computed by SmartFloat and Fluctuat ...... 39
3.3 Average runtimes ...... 41

4.1 Summary of absolute errors for different implementations ...... 47
4.2 Rewrite rules ...... 49
4.3 Maximum errors GP results (1) ...... 53
4.4 Maximum errors for GP results (2) ...... 54
4.5 Best and worst absolute errors among exhaustively enumerated rewritings, determined by simulation ...... 55
4.6 Average runtimes ...... 56
4.7 Comparison with random search ...... 57

5.1 Specification language semantics...... 63

6.1 Comparison of ranges with simulation ...... 79
6.2 Triangle progression ...... 80
6.3 Comparison of errors with Fluctuat and simulation ...... 87
6.4 Runtimes of error computation ...... 89

7.1 Error comparison against state of the art ...... 98
7.2 Runtimes for discontinuity benchmarks ...... 99
7.3 Errors and runtimes for the mean benchmark ...... 105
7.4 Errors and runtimes for pendulum benchmark ...... 107

8.1 Error comparison for unary benchmarks ...... 122
8.2 Error comparison for multivariate benchmarks ...... 122
8.3 Average cumulated runtimes ...... 123
8.4 Average individual runtimes ...... 123


1 Introduction

Numerical errors are rare, rare enough not to care about them all the time, but yet not rare enough to ignore them. — William M. Kahan

Numerical programs are widely used in mathematics, science and engineering. For example, scientific computing works with mathematical models which aim to explain the phenomena of the physical world. Since they usually cannot be solved analytically, numerical algorithms are needed to find approximate solutions. In computer science, many areas such as image and signal processing, graphics, vision and machine learning also rely on numerical computations. Finally, many of the devices we use on an everyday basis are cyber-physical or embedded systems: they have a numerical control system that interacts with the physical world. Many of these applications are safety-critical, so it is very important to verify that they perform their function correctly.

One of the many aspects that makes their verification difficult is the inherent gap between the continuous nature of the mathematics and the physical processes and the discrete implementation on today's digital computers. Many numerical algorithms are naturally expressed in real arithmetic and often use algorithms which compute the exact answer only in the limit, for instance with infinitely many components of a sequence or with an infinitely small step size. Unfortunately, on a computer we only have finite resources available, so we need to discretize and approximate both the arithmetic and the infinite iterations. Finite-precision arithmetic can approximate real arithmetic, but needs to round the result of every computation step, committing a round-off error. When we stop an iteration after a finite number of steps, we introduce another difference to the ideal computation, which we call truncation error. Sensors and experimental equipment are not perfectly accurate either and contribute a measurement error on the inputs, which then ultimately also affects the overall uncertainty of the result.


Altogether we need to consider a variety of errors when assessing the correctness of numerical results: measurement, roundoff and truncation errors. While the individual errors are usually small, they accumulate and can render the final result inaccurate or entirely meaningless. An N-version experiment [81] has shown that different implementations of an identical seismic data processing algorithm can produce vastly different results. Much of the discrepancy has been traced back to the numerical portion of the code. One important task of numerical program verification is thus understanding how the individual errors combine and, ideally, showing that the overall error remains small enough.

Finite-Precision Roundoff Errors   Because of the absence of real arithmetic, today's numerical algorithms usually use finite-precision arithmetic, such as floating-point or fixed-point arithmetic, for their computations. Base-two floating-point arithmetic has become popular also because it is nowadays implemented in dedicated hardware and the computations are thus very fast. Fixed-point arithmetic can be implemented entirely with integers and is mostly used when a dedicated floating-point unit is not available because of cost or energy constraints, as is the case for many embedded systems. Both arithmetics work on a rational subset of the reals, with a limited number of digits for precision and with a limited range. For example, 1/10 cannot be represented exactly in standard floating-point arithmetic because it is not a finite binary fraction, so if we compute 0.1 * 9 in single (32 bit) precision we get 0.90000004 instead of 0.9. Furthermore, finite-precision arithmetic is not associative and multiplication does not distribute over addition, so the usual mathematical equivalences do not hold any more. This implies that the order of computation matters and, for example, summing up a list of numbers in two different orders can produce two vastly different results if the numbers do not have the same orders of magnitude.
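Both effects are easy to observe directly. The following small Scala sketch (the object name and the concrete values are chosen purely for illustration) prints the rounded single-precision product and shows that two mathematically equivalent summation orders give different results:

object FinitePrecisionDemo extends App {
  // 1/10 has no exact binary representation, so rounding is visible immediately:
  println(0.1f * 9)              // prints 0.90000004 in single precision

  // Addition is not associative: the small term is absorbed by the large one
  // in the first ordering, but survives in the second.
  val big = 1e16
  val small = 1.0
  println((big + small) - big)   // 0.0
  println((big - big) + small)   // 1.0
}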

Unfortunately, round-off (and other) errors are propagated through a computation in complex and hard to predict ways. Even seemingly simple computations like calculating the area of a triangle can have surprising results. Recall the standard textbook formula for a triangle with sides a,b and c:

$A = \sqrt{s(s-a)(s-b)(s-c)} \qquad \text{where} \quad s = \frac{a+b+c}{2}$

We have implemented this computation with double precision and with quadruple double floating-point arithmetic [16], and then randomly sampled inputs for a, b and c. If we compare the results, we obtain an estimate of the absolute round-off error of the double precision computation, which we plot in Figure 1.1. We observe that for some triangles whose area is small the errors grow. A careful examination of the computation reveals that the textbook formula is numerically unstable for flat triangles [93]. That is, for triangles where the sum of two sides is close to the length of the third (e.g. $a + b \approx c$), the subsequent difference $s - c$ becomes very small and magnifies the existing round-off errors. This effect is also called cancellation, as many correct significant digits cancel out in the subtraction and leave mostly digits affected by errors behind. Spotting cancellation can be very hard as it requires the ranges of all variables to be known.
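As a rough illustration of this experiment (the thesis uses the quad double library as the reference; here a BigDecimal-based radicand computation stands in for it, and the concrete side lengths are made up to give a nearly flat triangle), consider the following Scala sketch:

object TriangleArea extends App {
  // Textbook formula evaluated entirely in double precision.
  def areaDouble(a: Double, b: Double, c: Double): Double = {
    val s = (a + b + c) / 2.0
    math.sqrt(s * (s - a) * (s - b) * (s - c))
  }

  // Higher-precision reference: the radicand is computed with BigDecimal on the
  // exact binary values of the inputs, so the cancellation in s - c is harmless.
  def exact(d: Double): BigDecimal = BigDecimal(new java.math.BigDecimal(d))
  def areaReference(a: Double, b: Double, c: Double): Double = {
    val (ba, bb, bc) = (exact(a), exact(b), exact(c))
    val s = (ba + bb + bc) / BigDecimal(2)
    math.sqrt((s * (s - ba) * (s - bb) * (s - bc)).toDouble)
  }

  // A nearly flat triangle: a + b is only barely larger than c.
  val (a, b, c) = (1.0, 0.75, 1.7499999)
  println(s"double:    ${areaDouble(a, b, c)}")
  println(s"reference: ${areaReference(a, b, c)}")
}

The two printed values differ in their trailing digits; the difference is the roundoff error of the double-precision evaluation, magnified by the cancellation in s - c.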


Figure 1.1 – Absolute errors committed for different triangle areas, determined by random sampling and comparing results obtained with double precision against those obtained with quadruple double precision. The inset shows a magnification of the triangle area range [1,2].

Furthermore, the errors in Figure 1.1 are clearly not evenly spread. As the inset shows for triangle areas between 1 and 2, round-off errors appear to be rather randomly distributed. It is clear then that we cannot in general verify the result of a numerical computation by a simple manual inspection of either the source code or the result. Furthermore, approaches based on sampling, as used in testing [18, 39, 37], require a large number of samples to provide reasonable error estimates, but still cannot provide sound guarantees.

Accuracy vs Efficiency   There is an inherent trade-off between accuracy and efficiency: the more precise a data type we choose, the longer the computation is going to run and the more energy it is going to consume in general. One size does not fit all, however, and where a program is placed on this spectrum highly depends on the particular requirements and characteristics of the application. If the measurement noise is large and obscures any round-off errors, or the application can tolerate a certain error (e.g. a human observer), then we may not need the full 64 bit floating-point precision that is often the default choice [143]. In other cases, an (embedded) device may not even have a floating-point unit and the computation has to be implemented in fixed-point arithmetic, while being accurate enough to ensure stability of the controller [108]. Other computations, however, may have very high precision requirements, because of numerical instabilities or because a small difference can have a huge effect on decision making [33]. Similar considerations apply to the choice of algorithms as well, where they influence the magnitudes of truncation errors, for instance. Energy and performance concerns have recently also sparked interest in approximate hardware, which, for example, performs arithmetic operations less precisely [20]. However, in order to use this trade-off without sacrificing correct behavior of the application, we need to have some confidence that our choice of data type (and algorithm) is appropriate. For this we have to be able to quantify the overall errors and convince ourselves that they are sufficiently small.


State of the Art   Numerical error estimation is an integral part of numerical program verification and has always been a concern in traditional numerical analysis [95]. The programmer is cautioned that round-off errors can corrupt the results, but very often it is hard to tell when this is actually the case and no general guidelines exist. Verification is also an important part of scientific computing, although it has a somewhat different meaning than in software verification. Verification and validation (V&V) is mostly concerned with ensuring that numerical computations conform to the physical reality or agree sufficiently with analytical solutions, if such exist [124]. Validated numerics are another example of verified computation, where interval methods are used to compute guaranteed enclosures of the ideal real-valued result [137]. While all these approaches are being successfully used, they are also very specific and need to be developed for each particular problem individually. In addition, interval arithmetic cannot be used blindly as it loses correlation information, and for useful results the computation may need to be completely reformulated in a non-obvious way.

In software verification, tools that perform a sound analysis exist as well. For example, interactive theorem provers can prove very detailed and precise properties about floating-point programs [13, 25, 103], but require an expert user. Hardware verification on the other hand, due to its finite-state nature, has been very successful [80] and is now routinely used, but software verification tools still have a long way to go before they can be called mature or even merely practical for use by the average programmer [102]. Only Fluctuat [72], which is based on abstract interpretation [42], is an automated tool for soundly estimating round-off errors.

The goal of this dissertation is to advance the state of the art of numerical error analysis, verification and synthesis, by developing general, automated and accurate techniques and tools which are applicable to a wide range of applications and can also be used by non-experts in numerical analysis.

1.1 Challenges in Analysis and Verification

A central concern in numerical program verification is the determination of the ranges (or intervals) individual variables can take. This is important, for example, if an algorithm is only defined or valid on certain domains and we need to verify that all inputs, which may also be results of previous computations, respect this domain. Range computation is, however, also important for error estimation because the magnitude of round-off errors directly depends on the magnitude of the results, as the maximum number of digits is fixed. Many numerical applications are nonlinear and feature correlations between their variables, characteristics that make range computation a challenge. The standard method to estimate the range of an arithmetic expression f(x) given the range for the input(s) x has been interval arithmetic [120]. Unfortunately it does not keep track of any correlations and thus can produce greatly inaccurate results. And while linearization is a widely used and suitable technique in many other domains, applied naively in the context of numerical error estimation it produces results that are too approximate to be very useful.
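The loss of correlation is easy to reproduce with a few lines of code. The following Scala sketch of a bare-bones interval type is illustrative only: it is not the interval implementation used by our tools and omits the outward (directed) rounding that a sound implementation needs.

case class Interval(lo: Double, hi: Double) {
  def +(that: Interval) = Interval(lo + that.lo, hi + that.hi)
  def -(that: Interval) = Interval(lo - that.hi, hi - that.lo)
  def *(that: Interval) = {
    val p = Seq(lo * that.lo, lo * that.hi, hi * that.lo, hi * that.hi)
    Interval(p.min, p.max)
  }
}

object DependencyProblem extends App {
  val x = Interval(0.0, 1.0)
  // x - x is 0 for every x, but the correlation between the operands is lost:
  println(x - x)                          // Interval(-1.0,1.0)
  // x * (1 - x) is at most 0.25 on [0, 1], yet the computed upper bound is 1:
  println(x * (Interval(1.0, 1.0) - x))   // Interval(0.0,1.0)
}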



Figure 1.2 – Absolute actual round-off errors computed by comparison against quad double precision arithmetic in 300 iterations of a simple gravity simulation. Note that the actual error grows with the iteration number.

Conditional branches introduce discontinuities which can, in the presence of errors, result in diverging behavior between the ideal real-valued computation and its finite-precision implementation. We call the resulting difference discontinuity error. For example, suppose that we want to approximate $\sqrt{1 + x}$ for small x:

if (x < 1e-4) 1 + 0.5 * x else sqrt(1 + x)

If our application can tolerate a less accurate square root computation, the approximation is preferable as it is significantly faster. If in the preceding computation x has accumulated an error, and the finite-precision value is for example 0.000099 instead of 0.00012, the finite-precision computation will take the less accurate if branch, even though the ideal real computation would have taken the else branch. Since the computation of this conditional is not continuous, the difference between the real and the finite-precision result due to round-offs is further exacerbated by the conditional. Quantifying discontinuity errors is hard due to nonlinearity and because the two branches of the conditional usually share variables and are thus tightly correlated.
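To make the divergence concrete, the following Scala sketch evaluates the example above on an ideal input and on a perturbed input (the object name and the size of the perturbation are made up for illustration); the two values straddle the threshold and therefore take different branches:

object DiscontinuityDemo extends App {
  // The finite-precision program from the example above.
  def approx(x: Double): Double =
    if (x < 1e-4) 1 + 0.5 * x else math.sqrt(1 + x)

  val xIdeal = 0.00012    // ideal real value: takes the else branch
  val xNoisy = 0.000099   // value with accumulated error: takes the if branch

  println(s"ideal branch result:  ${math.sqrt(1 + xIdeal)}")
  println(s"actual branch result: ${approx(xNoisy)}")
  // The total difference combines the propagated input error with the gap
  // between the two branch functions at the point where the paths diverge.
}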

Loops are also a common feature in numerical programs. In forward computations such as numerical integration of differential equations, round-off errors grow, in general, unboundedly. As an illustration, consider a simulation of the planet Jupiter orbiting the Sun, for which we plot the absolute errors of one of the coordinates, x, in Figure 1.2. While the values of x stay bounded, the errors grow, making it impossible to find a constant absolute error bound.

For this reason, current error computation techniques do not work well and often return an error bound of $[-\infty, \infty]$.

Numerical errors in the loops of many iterative algorithms behave differently, however. These algorithms are often self-correcting in that the errors from one loop iteration are corrected in later ones. This means that tracking round-off errors across iterations, as current approaches would do, is not very sensible. With each iteration, these algorithms gradually converge to the exact result, but truly reach it only in the limit. As we need to stop the algorithm after a finite number of iterations, the computed result is only approximate. We want to quantify the truncation error that has been committed, but do so in a way that is generally applicable to a large class of applications.
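The first kind of behavior, roundoff errors that keep growing in a plain forward computation, is already visible in the simplest settings. The following Scala sketch (a toy accumulation, not the gravity simulation of Figure 1.2; all values are illustrative) measures the error of repeated double-precision additions against a BigDecimal reference and shows it growing with the iteration count:

object LoopErrorGrowth extends App {
  var sumDouble = 0.0
  var sumExact  = BigDecimal(0)

  for (i <- 1 to 100000) {
    sumDouble += 0.1                 // rounded at every iteration
    sumExact  += BigDecimal("0.1")   // exact decimal accumulation
    if (i % 20000 == 0)
      println(f"$i%6d iterations: |error| = ${(BigDecimal(sumDouble) - sumExact).abs}")
  }
}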

1.2 Synthesis

Too often, accuracy is an afterthought. For most applications today, the programmer first picks a numerical data type for his or her program, implements the algorithm as if it were over the reals, and then (maybe) remembers to check that the results produced are close enough to what is expected. Such an approach limits the range of possible implementations by an up-front choice of data type, but also limits applicable optimizations, since finite-precision arithmetic does not obey many mathematical rules. Furthermore, this approach creates a gap between the ideal real-valued algorithm (on paper, in math) and the actual finite-precision implementation. The programmer has to deal with implicit low-level details of the finite-precision arithmetic. We argue that accuracy should ideally be made explicit and always be part of the program itself. One easy and light-weight way, which we also use in this thesis, is to include assertions which also check for numerical errors. Going further, we also propose a specification language which allows programs to be written over a real-valued data type with explicit accuracy requirements. It is then up to a “verifying compiler” to select a suitable data type that satisfies the error specification. Thus, the compiler synthesizes a finite-precision implementation from its real-valued specification fully automatically.

Finite-precision arithmetic does not obey the usual arithmetic laws that real arithmetic does. In particular, it is not associative, so different computation orders can produce different results. This is especially surprising if such a re-ordering is performed by a (traditional) compiler silently behind the scenes, even though over the reals such a transformation is mathematically valid. We can, however, exploit this ‘feature’ of finite-precision arithmetic to reduce round-off errors “for free” by searching among the possible mathematically equivalent formulations of our computation and generating the one which minimizes the overall round-off error. Such a rewriting is semantically permissible when the input language is real-valued, for example like our proposed specification language, and thus admits such transformations. Since we can do this optimization over an entire range of inputs at compile time, we can improve the accuracy of the computation without changing the data type.
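A two-line Scala example of the effect (the constants are chosen purely for illustration): both expressions below are equal over the reals, yet in double precision one evaluation order preserves the small summand while the other largely destroys it.

object RewritingDemo extends App {
  val a = 1e16
  val b = -1e16
  val c = 3.14

  println((a + b) + c)   // 3.14: the large terms cancel first
  println(a + (b + c))   // 4.0:  c is partly absorbed by b before the cancellation
}

A rewriting procedure that searches over such mathematically equivalent orderings can pick the more accurate variant automatically.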


1.3 Outline and Contributions

This thesis presents techniques for sound numerical error analysis, verification and synthesis. In particular, we have developed methods for accurately estimating errors in different kinds of numerical programs. Furthermore, we explore both static and runtime approaches and present different ways of integrating numerical verification into a programming language with data type libraries, macros and compiler plugins. We have implemented and experimentally evaluated all our techniques for their usability and effectiveness. This thesis is organized as follows:

Chapter 2 provides necessary background on finite-precision and range arithmetic and then describes our technique to compute and track round-off errors of floating-point and fixed-point arithmetic. Denoting by f the ideal real-valued function and by $\tilde{f}$ the corresponding finite-precision one, we develop a method to soundly bound the roundoff error $|f - \tilde{f}| \le \rho$. Our method is based on affine arithmetic [51], which we adapt for tracking the errors of a single computation ($|f(a) - \tilde{f}(a)| \le \rho$) as well as for a range [a,b] of user-defined inputs ($\forall x \in [a,b].\ |f(x) - \tilde{f}(x)| \le \rho$). We can track round-off errors for all arithmetic operations, as well as a number of commonly used transcendental functions. The approach for tracking roundoff errors presented in this chapter forms the basis of all our techniques and is used throughout this thesis.

Chapter 3 presents the AffineFloat and SmartFloat data types, which wrap our round-off error analysis to replace the regular Double floating-point type and provide, in addition to the usual result, a sound upper bound on the error. In this dynamic approach, we integrated our runtime seamlessly into Scala, making it easily usable on existing code. Our experiments show that the affine arithmetic based error computation in general outperforms the traditional interval arithmetic based approach precision-wise, often by orders of magnitude. We further demonstrate the ability of our data types to identify problematic code such as the triangle area computation. Chapters 2 and 3 are based on the paper [44].

Chapter 4 uses our round-off error computation together with a genetic algorithm to search for fixed-point arithmetic implementations that minimize round-off errors. That is, given a real-valued expression t, our tool automatically searches for a mathematically equivalent expression t′ that attains $\min_{t' \equiv t} \max_{x \in [a,b]} |t(x) - \widetilde{t'}(x)|$. By exploiting the non-associativity of the arithmetic, our tool Xfp can find mathematically equivalent expressions which are often up to 50% more precise than the baseline formulation, as it may be generated by a tool like MATLAB [4]. This work is based on the paper [48].

Chapter 5 introduces our real-valued specification language for writing numerical programs, which explicitly includes errors. We present a compilation algorithm that takes this language over the Real data type and based on the given error requirements determines a suitable floating-point or fixed-point data type for the concrete implementation.


In order to handle interesting code including method calls, we introduce a range of approximations, which we have implemented in our “verifying compiler” tool called Rosa. This work has been presented in [47].

Chapter 6 presents our techniques for statically computing accurate range and error bounds inside Rosa on straight-line arithmetic expressions. Combining interval and affine arithmetic with nonlinear SMT solving allows us to use correlations between variables as well as additional constraints to determine range bounds accurately: $[a,b] = \text{getRange}(P, \text{expr}) \Rightarrow \forall x, res.\ P(x) \land res = f(x) \rightarrow (a \le res \land res \le b)$, where P captures initial ranges and additional constraints. Then we separate numerical errors into propagated initial errors and newly committed round-off errors and use this separation to derive a new method for error propagation ($\forall x, y \in [a,b].\ |x - y| \le \text{init. error} \rightarrow |f(x) - f(y)| \le \text{prop. error}$), which improves error bounds significantly over plain affine arithmetic.

Chapter 7 presents two techniques for statically computing discontinuity errors due to conditional branches, which trade off accuracy for scalability in different ways ($\forall x, y \in [a,b].\ |x - y| \le \text{init. error} \rightarrow |f_1(x) - \tilde{f}_2(y)| \le \text{disc. error}$, where $f_1$ and $f_2$ represent the two branches). Then, we apply our separation of errors idea and error propagation technique to (unbounded) loops and derive an error specification which is a function of the number of iterations. Beyond this inductive specification, we also show that this approach allows us to compute error bounds for applications which were out of reach for state-of-the-art tools. That is, we compute a bound on $|f^m(x) - \tilde{f}^m(y)|$ valid for all $x, y \in [a,b]$ with $|x - y| \le \text{init. error}$. Chapters 6 and 7 are based on [47] as well as recent work under submission [46].

Chapter 8 presents a certification procedure, implemented in the library called Cassia, for dynamically verifying the accuracy of solutions of systems of nonlinear equations. These systems are usually solved with self-correcting iterative methods which are highly optimized. By integrating theorems from validated numerics with an assertion-based approach we can leverage the efficiency of these methods, yet provide guarantees on the results. We have presented this work in [45].

1.4 Usage Scenarios

This thesis presents four open-source implementations of our techniques. While they are, for now, separate, we view them as building blocks which can be combined into one overall system, handling different types of programs and applications.

For instance, our static analyzer Rosa can be used to soundly verify that numerical errors in (smaller) computation kernels stay below a required bound. The verification effort may fail, due to the limited scalability of the static approach, or because the asserted property is simply not correct. In this case, the programmer can identify a trace of interest and debug or verify it with our SmartFloat runtime library.

If the code is using an external library for solving systems of equations, its untrusted result can be certified to be close enough to the true roots by our Cassia runtime library. Finally, the programmer can use our tool Xfp to not only verify, but also to improve a fixed-point implementation, by automatically rewriting the arithmetic expressions such that roundoff errors are reduced.

Our tools are “push-button” in the sense that they only require the user to provide bounds on the input ranges and, where applicable, measurement errors on inputs in addition to the program itself. We believe that it is reasonable to expect that a domain expert can provide such a specification. Each tool then performs the error estimation or search fully automatically.

We imagine our tools to be used, for example, by embedded systems engineers to soundly verify that roundoff errors do not invalidate stability properties established for a real-valued world. For the programmer of a scientific computing application, our tools can help him or her to find numerical issues in the computation kernel, or determine which of several possible alternative implementations is preferable from the perspective of numerical stability.

In this spirit, we have tested and evaluated all our tools on real-world benchmarks such as embedded controllers, physical simulations and problems from biology as well as chemistry. In addition, we have applied our tools in two case studies, where we helped astronomers and researchers at EPFL gain confidence in their implementations. In subsection 3.2.6, we apply our SmartFloat library to a computationally involved astronomical program and verify that the result is accurate enough for the needed purpose. In subsection 8.5.2, we are able to certify that solutions of an optimization problem from the energy domain are indeed correct up to the required error threshold.


2 Bounding Finite-Precision Roundoff Errors

MATLAB’s creator Dr. Cleve Moler used to advise foreign visitors not to miss the country’s two most awesome spectacles: the Grand Canyon, and meetings of IEEE p754. — William M. Kahan [145]

One of the main challenges in writing numerical programs is finite-precision arithmetic (floating-point or fixed-point), because it inevitably introduces round-off errors. While the round-off from one arithmetic operation may appear tiny, these errors can accumulate and even render results entirely meaningless [81]. In the area of embedded systems, it has been shown that the region of stability for embedded controllers directly depends on the round-off errors [9]. It is thus important to understand how these errors accumulate and propagate through a computation.

This chapter begins with necessary background on floating-point, fixed-point, interval and affine arithmetic (AA) [51]. We then introduce our method for computing and tracking round-off errors, which forms an important building block for our techniques in the rest of this thesis.

Our method for computing and tracking round-off errors is based on affine arithmetic. Standard affine arithmetic cannot be used as-is for a sound error computation, so we introduce two adaptations: one for tracking round-off errors of a single computation and one for all computations whose inputs are in given ranges. In addition to arithmetic operations, we also implement AA for commonly used transcendental functions. It turns out that this is non-trivial to implement because of round-off errors in the internal computation. We describe our solution, which nonetheless provides sound and accurate bounds. We will use the error computation presented here as a building block throughout the following chapters.
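To give an early feel for the representation (the full treatment follows in Sections 2.3 and 2.4), here is a deliberately minimal affine-form sketch in Scala. It is not the implementation used by our tools: in particular, a sound implementation must also account for the roundoff committed in these coefficient computations, which this sketch ignores.

// An affine form x0 + Σ xi·εi with εi ∈ [-1, 1]; each εi is a noise symbol
// shared between values that are correlated.
case class AffineForm(x0: Double, noise: Map[Int, Double]) {
  def +(that: AffineForm): AffineForm = {
    val keys = noise.keySet ++ that.noise.keySet
    val ns = keys.map(i => i -> (noise.getOrElse(i, 0.0) + that.noise.getOrElse(i, 0.0))).toMap
    AffineForm(x0 + that.x0, ns)
  }
  def unary_- : AffineForm = AffineForm(-x0, noise.map { case (i, v) => i -> -v })
  def -(that: AffineForm): AffineForm = this + (-that)
  def radius: Double = noise.values.map(math.abs).sum
  def interval: (Double, Double) = (x0 - radius, x0 + radius)
}

object AffineDemo extends App {
  val x = AffineForm(0.5, Map(1 -> 0.5))   // represents x ∈ [0, 1] via noise symbol ε1
  println((x - x).interval)                // (0.0,0.0): the correlation is preserved
}

In contrast to the interval sketch shown in Chapter 1, the shared noise symbol lets x - x collapse to exactly zero.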

Our error computation tracks worst-case absolute errors of floating-point or fixed-point arithmetic. That is, if x is the ideal real value we want to compute and $\tilde{x}$ is the actually computed number in finite arithmetic, then our goal is to determine a sound upper bound on $|x - \tilde{x}|$.


Naturally, we want this bound to be as tight as possible. In order to achieve soundness, we assume worst-case round-off errors at each operation, even though they will not be reached in practice at every computation step. Our analysis tracks absolute and not relative errors ($|(x - \tilde{x})/x|$) as this is more natural, but relative errors can be computed from the absolute error bound, if needed. When the range of x includes zero, computation of relative errors becomes problematic, however, due to the division by zero. This can happen e.g. when tracking a range of positive and negative values even if the zero value is never obtained in practice. Choosing absolute errors avoids this problem, so we report these.

The material in this chapter is mostly based on [44].

2.1 Finite-Precision Arithmetic

We begin with a review of floating-point and fixed-point arithmetic, which we denote by F.

2.1.1 IEEE Floating-Point Arithmetic

We assume throughout that floating-point arithmetic conforms to the IEEE 754 floating-point standard [84]. Recent CPUs generally follow the standard fully, and most programming languages respect the minimal set of requirements that we rely on and which we present here. Our tools and examples all run on the Java Virtual Machine (JVM), whose (partial) conformance to the IEEE 754 standard is documented in [104]. The standard defines, among others, the single and double precision floating-point data types (with 32 and 64 bits, respectively), arithmetic operations on these data types and various rounding modes.

The five rounding modes available in IEEE 754 are rounding to nearest (ties to even or ties away from zero) and directed rounding (towards zero, towards $+\infty$ and towards $-\infty$). We will assume that the code we are interested in uses rounding to nearest, ties to even. This is the default rounding mode in most programming languages, and also the only one available on the JVM. Our tools also use the directed rounding modes towards $+\infty$ and $-\infty$ internally, namely for interval arithmetic, which we implement through a native library. In those cases, we will denote by $\downarrow x$ and $\uparrow x$ the value of x when rounded towards $-\infty$ and $+\infty$, respectively.

The standard specifies that the basic arithmetic operations $\{+, -, *, /, \sqrt{\ }\}$ be computed as if first an exact infinite-precision result was computed and then rounded to the specified precision. For the rounding-to-nearest rounding mode this means that results are rounded correctly, that is, the result from any such operation must be the closest representable floating-point number. Provided there is no overflow and underflow, the result of an arithmetic operation in floating-point arithmetic $\{\oplus, \ominus, \otimes, \oslash, \widetilde{\sqrt{\ }}\}$ then satisfies
$$x \odot y = (x \circ y)(1 + \delta), \quad \circ \in \{+, -, *, /\},\ \odot \in \{\oplus, \ominus, \otimes, \oslash\}, \qquad \widetilde{\sqrt{x}} = (\sqrt{x})(1 + \delta), \qquad |\delta| \le \epsilon_M \qquad (2.1)$$

12 2.1. Finite-Precision Arithmetic

where $\epsilon_M$ is the so-called machine epsilon that determines the upper bound on the relative error. For instance, for single precision $\epsilon_M = 2^{-24}$ and for double precision $\epsilon_M = 2^{-53}$. We will use this abstraction throughout to compute round-off errors. These can also be measured in terms of the unit in the last place (ulp), which is the value of the least significant bit of a floating-point number. Arithmetic operations are then rounded to within $\frac{1}{2}$ ulp.
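The abstraction of Equation 2.1 can be checked empirically for individual operations. The following Scala sketch (illustrative only; the object and helper names are made up) computes the double-precision machine epsilon and verifies the bound for one sample multiplication, using exact BigDecimal conversions of the double operands as the reference:

object RoundingModelDemo extends App {
  // Machine epsilon for double precision: 2^-53.
  val eps = math.ulp(1.0) / 2
  println(eps)                                // 1.1102230246251565E-16

  // Check |x ⊗ y - x * y| <= eps * |x * y| for one multiplication, where the
  // reference is the exact product of the binary values of the two doubles.
  def exact(d: Double) = BigDecimal(new java.math.BigDecimal(d))
  val x = 0.1
  val y = 0.3
  val trueProduct = exact(x) * exact(y)
  val relError = ((exact(x * y) - trueProduct) / trueProduct).abs
  println(relError <= BigDecimal(eps))        // true
}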

Underflow occurs when a computed value is too small to be represented by a regular floating-point number (called normalized) and is handled in one of two ways. The first option is flushing to zero, i.e. discarding all significant digits. The roundoff error is then bounded by the smallest positive normalized number. The second, more standard, option is gradual underflow, where denormalized numbers are used to fill the gap between the smallest normalized number and zero. The roundoff error is bounded by half the ulp of the smallest positive normalized number in this case. In this thesis, we consider a range containing only denormals to be an error and will thus be working only with Equation 2.1. Note that for systems with gradual underflow this bound on the round-off error also soundly over-estimates the round-off for any denormal results contained in a larger range. For more (detailed) information on floating-point arithmetic see [71].

The techniques described in this thesis can be ported to other languages, such as C/C++, or to languages specifically targeted, for instance, at GPUs or parallel architectures, as long as the computation order is not changed by the compiler, the implementation of floating-point arithmetic adheres to the IEEE 754 rounding-to-nearest semantics (Equation 2.1), and denormal floating-point numbers are used for underflow.

While our examples use single- and double-precision floating-point numbers, our techniques are parametric in the machine epsilon. This implies that we can analyze code in any floating-point implementation, as long as it can be abstracted by Equation 2.1. Changing the number of exponent and mantissa bits may be useful for specific applications with programmable hardware platforms or with dynamic (software) allocation of exponent and mantissa bits.

2.1.2 Higher-Precision Floating-Point Arithmetic

One possible way to avoid or reduce round-off errors is to use higher-precision or arbitrary-precision arithmetic. These are usually provided through software libraries; for instance, the QuadDouble library uses two or four double floating-point numbers to obtain 32 and 64 decimal digit precisions [82]. The GNU MPFR Library [62] implements multiple-precision floating-point computations, and many other similar libraries exist. For certain applications, for instance geometric computations, such a high precision is indeed crucial. For example, the library LEDA keeps track of the computation history in order to determine geometric predicates [116]. For all other applications, however, we want to benefit from the performance that hardware support for single- and double-precision IEEE floating-point arithmetic provides.


[77] proposes a new data type with a variable number of mantissa and exponent bits in order to make arithmetic more memory efficient. Using this new data type, however, requires specifying up front a limit on the maximum number of these bits, and it is not clear how to estimate it soundly ahead of time, so it remains to be seen how applicable it will be to real-world problems.

2.1.3 Fixed-Point Arithmetic

An alternative for representing a subset of the rational numbers on a computer is fixed-point arithmetic. It presumes integer arithmetic hardware support only and is thus a common choice on embedded devices where a floating-point unit is not a good option due to resource bounds. This naturally entails a trade-off, as we can no longer rely on the dynamic properties of floating-point arithmetic. Instead, as the name suggests, the exponent is fixed for every intermediate variable (but not necessarily the same for all of them), and we have to explicitly implement the shifts that align them for arithmetic operations.

We assume a constant bit length for an entire computation, for example 16 or 32 bits. Every intermediate value in an arithmetic computation is assigned a fixed-point format which determines how many integer bits are required to cover the range of the value, and how many bits f remain for the fractional digits. An integer value x can then be interpreted as the rational number 2^−f · x. We assume signed integer arithmetic and truncation semantics, which implies that the maximum absolute round-off error for a value is given by 2^−f.

Implementing a real-valued expression in fixed-point arithmetic consists of first determining the ranges of all values in the program, including inputs, constants and intermediate results. This determines the minimum number of integer bits. Then, arithmetic operations are implemented with regular integer arithmetic and bit shifts to align the implicit decimal points [9].

Fixed-point formats are not unique and can result in different overall round-off errors. For a given arithmetic expression, this error is minimal when the round-off error at each operation is minimal. This is achieved when the formats are chosen such that the number of integer bits is sufficient but not larger than necessary, i.e. we maximize the number of fractional bits. Thus, to obtain an optimal fixed-point implementation, we need to evaluate the ranges of (intermediate) values as tightly as possible. Note that the round-off errors depend on the ranges of values, which is similar to the round-off error abstraction we use for floating-point arithmetic.
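The following sketch illustrates this format selection (it is not our actual implementation; the convention for counting integer bits is simplified): given a range and a total signed bit length, it returns the number of fractional bits and the resulting worst-case quantization error 2^−f.

// Choose a fixed-point format for a value known to lie in [lo, hi] and
// return the number of fractional bits f together with the truncation
// error bound 2^(-f). One bit is reserved for the sign.
def fixedPointFormat(lo: Double, hi: Double, bitLength: Int = 16): (Int, Double) = {
  val maxAbs = math.max(math.abs(lo), math.abs(hi))
  // integer bits such that maxAbs < 2^intBits (simplified convention)
  val intBits = math.floor(math.log(maxAbs) / math.log(2)).toInt + 1
  val fracBits = bitLength - 1 - intBits
  (fracBits, math.pow(2.0, -fracBits))
}

For example, a value in [−1.5, 1.5] with a 16-bit format gets 1 integer bit and 14 fractional bits, so its round-off is bounded by 2^−14.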

2.2 Interval Arithmetic as a Baseline

The standard approach to guaranteed computation is interval arithmetic (IA) [120], and we will use it as the baseline for our techniques and experiments. IA associates with each variable

x a closed interval [a,b] such that x ∈ [a,b] ⇔ a ≤ x ∧ x ≤ b, where x, a, b can be elements of R, Q, Z or F. When the lower and upper bounds are equal, we call the interval a point interval. We will also use the notation [x] to denote the interval represented by the variable x.

Given a real-valued function f(x_1, ..., x_n) and ranges for its inputs [a_i, b_i] such that x_i ∈ [a_i, b_i], interval arithmetic computes an over-approximation [c,d] of the range of f over the input domain, i.e.

    ∀ x_i. x_i ∈ [a_i, b_i] → f(x_1, ..., x_n) ∈ [c,d]

If X and Y are two intervals, then binary arithmetic operations are defined (over R) as

    X ∘ Y = { z | ∃ x ∈ X, y ∈ Y. z = x ∘ y }

and similarly for other functions such as square root. We can implement interval arithmetic straightforwardly with rationals or integers; however, when using floating-point arithmetic, directed rounding is needed to obtain sound interval bounds.
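A minimal sketch of these definitions in Scala, using BigDecimal endpoints for readability (our library uses floating-point endpoints with directed rounding, or rationals, to stay sound; this sketch omits that):

// Interval arithmetic sketch.
case class Interval(lo: BigDecimal, hi: BigDecimal) {
  require(lo <= hi)
  def +(that: Interval): Interval = Interval(lo + that.lo, hi + that.hi)
  def -(that: Interval): Interval = Interval(lo - that.hi, hi - that.lo)
  def *(that: Interval): Interval = {
    // the extreme products of the endpoints determine the resulting bounds
    val products = for (a <- Seq(lo, hi); b <- Seq(that.lo, that.hi)) yield a * b
    Interval(products.min, products.max)
  }
}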

Interval arithmetic can be used to obtain sound bounds on finite-precision round-off errors by interpreting the width of the interval as the error on the corresponding variable. Input intervals are point intervals if the number can be represented exactly in finite precision; otherwise, their lower and upper bounds are the next smaller and the next larger representable finite-precision number, respectively. After performing the computation in standard interval arithmetic, we can read off the bound on the round-off error from the width of the result's interval.

Unfortunately, intervals give too pessimistic estimates in many cases. The problem is easy to demonstrate: if X is an interval [0,a], then interval arithmetic approximates the expression X − X with [−a,a], although it is, in fact, always equal to zero. Essentially, interval arithmetic ignores correlations between variables, i.e. it approximates x − x in the same way as it would approximate x − y when x and y are unrelated variables that both belong to [0,a]. Furthermore, we would like to not only track a single computation (with initially point intervals), but a range of inputs, and obtain roundoff error bounds which are sound for all possible executions on those inputs. In other words, we want the intervals to track two sources of uncertainty:

• roundoff errors due to the difference between the ideal and the finite-precision value

• uncertainty on the inputs (which will in general be larger in magnitude)

Any approach that conflates these two sources of uncertainty will quickly become inaccurate in distinguishing the (smaller) roundoff errors. That said, intervals can be very useful in certain carefully chosen cases, and we also use them in later chapters.


2.3 Affine Arithmetic

Affine arithmetic (AA) addresses the difficulty of interval arithmetic in handling correlations between variables. Affine arithmetic was originally introduced in [51] and developed to compute ranges of functions over the domain of reals, with the actual calculations implemented in double floating-point precision. A possible application of affine arithmetic, as originally proposed, is finding roots of a function in a given initial interval by a branch-and-bound approach.

Affine arithmetic represents possible values of variables as affine forms

    x̂ = x_0 + Σ_{i=1}^{n} x_i ε_i,    ε_i ∈ [−1, 1]

Using the terminology from [51], x_0 denotes the central value (the mid point of the represented interval) and each noise term x_i ε_i represents a deviation from the central value with maximum magnitude x_i. We will call the ε_i's noise symbols. Note that the sign of x_i does not matter in isolation; it does, however, reflect the relative dependence between values. For example, take x = x_0 + x_1 ε_1; then, in real number semantics,

    x − x = (x_0 + x_1 ε_1) − (x_0 + x_1 ε_1) = x_0 − x_0 + x_1 ε_1 − x_1 ε_1 = 0

In contrast, if we do not have a correlation (no shared noise symbols), the resulting interval has width 2·x_1 and not zero: x − x′ = (x_0 + x_1 ε_1) − (x_0 + x_1 ε_2) = x_1 ε_1 − x_1 ε_2 ≠ 0. The range represented by an affine form, denoted by [x̂], is computed as

    [x̂] = [x_0 − radius(x̂), x_0 + radius(x̂)],    where radius(x̂) = Σ_{i=1}^{n} |x_i|

A general affine operation αx̂ + βŷ + ζ consists of addition, subtraction, addition of a constant (ζ) or multiplication by a constant (α, β). Expanding the affine forms x̂ and ŷ we get

    αx̂ + βŷ + ζ = (αx_0 + βy_0 + ζ) + Σ_{i=1}^{n} (αx_i + βy_i) ε_i        (2.2)
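As a sketch of how Equation 2.2 looks in code (illustrative names; not the data structures of our actual library), an affine form can be stored as a central value plus a map from noise-symbol indices to coefficients, so that shared noise symbols remain correlated:

// Affine form x0 + sum(x_i * eps_i), with BigDecimal coefficients.
case class Affine(central: BigDecimal, noise: Map[Int, BigDecimal]) {
  def radius: BigDecimal = noise.values.map(_.abs).sum
  def interval: (BigDecimal, BigDecimal) = (central - radius, central + radius)

  // alpha * this + beta * that + zeta (Equation 2.2): coefficients of shared
  // noise symbols are combined linearly, which preserves correlations.
  def affineOp(alpha: BigDecimal, beta: BigDecimal, that: Affine,
               zeta: BigDecimal): Affine = {
    val indices = noise.keySet ++ that.noise.keySet
    val combined = indices.map { i =>
      i -> (alpha * noise.getOrElse(i, BigDecimal(0)) +
            beta * that.noise.getOrElse(i, BigDecimal(0)))
    }.toMap
    Affine(alpha * central + beta * that.central + zeta, combined)
  }

  def +(that: Affine): Affine = affineOp(1, 1, that, 0)
  def -(that: Affine): Affine = affineOp(1, -1, that, 0)
}

With this representation, x - x indeed yields an affine form whose noise coefficients are all zero, in contrast to interval arithmetic.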

Implementation in Finite Precision   When implementing affine forms in floating-point arithmetic, we need to take into account that some operations are not performed exactly, because the central value and the coefficients need to be represented in some finite (e.g. double) precision. As suggested in [51], the round-off errors committed during the computation, here called internal errors ι, can be added with a new fresh noise symbol to the final affine form. Each operation carries a round-off error and all of them must be taken into account when computing a rigorous bound for ι. The challenge hereby consists of accounting for

all round-off errors, but still creating a tight approximation. While for the basic arithmetic operations the round-off can be computed with Equation 2.1, there is no such simple formula for calculating the round-off for composed expressions (e.g. α·x_0 + ζ). We determine the maximum round-off error of an expression f(v_1, ..., v_m) using the following procedure [51]:

    z      = f(v_1, v_2, ..., v_m)
    z_{−∞} = ↓f(v_1, v_2, ..., v_m)
    z_{+∞} = ↑f(v_1, v_2, ..., v_m)
    ι      = max(z_{+∞} − z, z − z_{−∞})

where ↓ and ↑ denote rounding towards −∞ and +∞ respectively. That is, the program computes three values: (1) the floating-point result z using rounding to nearest, (2) the result z_{−∞} assuming worst-case round-off errors when rounding towards −∞ at each step, and (3) the analogous result z_{+∞} with rounding towards +∞ at each step. We determine the worst-case committed round-off error ι as the maximum difference between these values.
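The following sketch shows the structure of this procedure; evalNearest, evalDown and evalUp are hypothetical helpers that evaluate the same expression under round-to-nearest and the two directed rounding modes (in our implementation the directed rounding comes from a native library, since the JVM itself only offers rounding to nearest):

// Worst-case internal round-off iota of one expression evaluation,
// obtained from three evaluations under different rounding modes.
def internalError(evalNearest: => Double, evalDown: => Double,
                  evalUp: => Double): Double = {
  val z     = evalNearest   // rounding to nearest
  val zDown = evalDown      // every operation rounded towards -infinity
  val zUp   = evalUp        // every operation rounded towards +infinity
  math.max(zUp - z, z - zDown)
}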

An alternative is to use a rational data type backed by arbitrary-precision integers (e.g. Java's BigInteger) to avoid having to deal with round-off errors from the internal computation. The improved readability and maintainability of the implementation comes at the expense of efficiency, as the integers in the numerator and denominator grow quickly. Chapter 3 and chapter 8 present an application of affine arithmetic in a runtime technique and use the floating-point implementation. The remaining work, which is of static nature and considers shorter arithmetic expressions at a time, employs the rational version. In the rest of this chapter we will discuss the floating-point implementation; the rational one follows straightforwardly.

2.3.1 Nonlinear Computations

Affine operations are computed by Equation 2.2. For nonlinear operations like multiplication, inverse or square root, this formula is not applicable and the operations have to be approximated. Multiplication is derived from expanding and multiplying two affine forms:

    x̂ŷ = x_0 y_0 + Σ_{i=1}^{n} (x_0 y_i + y_0 x_i) ε_i + (η + ι) ε_{n+1}

where ι collects the internal errors when applicable and η is an over-approximation of the nonlinear contribution. To compute the latter, several possibilities exist of varying degrees of accuracy. The simplest way is to compute η as η = radius(x̂) · radius(ŷ). When radius(x̂), radius(ŷ) ≪ 1, this is sufficiently accurate, as the nonlinear contribution η will then also be small. For larger ranges, the over-approximation of the nonlinear part becomes

significant. Our implementation computes η as:

    η = max_{ε_i ∈ [−1,1]} | Σ_{1 ≤ i,j ≤ n} x_i y_j ε_i ε_j |
      ≤ Σ_{i=1}^{n} |x_i y_i| + Σ_{i < j} |x_i y_j + x_j y_i|
      ≤ Σ_{1 ≤ i,j ≤ n} |x_i y_j|

Division x̂/ŷ is computed as x̂ · (1/ŷ). For the approximation of unary functions, the problem is the following: given f(x), find α, ζ, δ such that

    [f(x)] ⊂ [αx + ζ ± δ]

where [f(x)] denotes the sound bound (interval) of the range of f, given a range for its input x. α and ζ are constants determined by a linear approximation of the function f, and δ represents all (round-off and approximation) errors committed, thus yielding a rigorous bound. [51] suggests two approximations for computing α, ζ and δ: a Chebyshev (min-max) or a min-range approximation. Figure 2.1 illustrates these two on an example function. For both approximations, the algorithm first computes the interval represented by [x̂] = [a,b] and then works with its endpoints a and b. In both cases we want to compute a bounding box around the result, by computing the slope (α) of the dashed line, its intersection with the y-axis (ζ) and the maximum deviation from this middle line (δ). In the following we assume that the function is monotone over the interval of approximation [a,b] and does not cross any inflection or extreme points. Where this is not the case, our library resorts to computing the result in interval arithmetic and converting it back into an affine form.

Min-range Compute the slope α of the function f (which we want to approximate) at one of the endpoints a or b. Compute two lines with slope α that go through the points (a, f (a)) and (b, f (b)) respectively. Fix ζ to be the average of the y-intercepts of the two lines. Compute δ as the maximum difference between f and the line with slope α and going through ζ, which occurs at either a or b, because the sign of the derivative of f does not change over [a,b] by assumption.

Chebyshev Compute the slope α of the line through both f (a) and f (b). This gives one bounding side of the wanted ‘box’ (parallelepiped). To find the opposite side, compute the point where the curve takes on the same slope again. Again, compute ζ as the average of the intersections of the two lines and δ as the maximum deviation at either the middle point v, a or b.

In general, the Chebyshev approximation computes smaller parallelepipeds, especially if the slope is significantly different at a and b. However, it also needs the additional computation of the middle point. Especially for transcendental functions like arccos,arcsin, etc., this


(a) Chebyshev    (b) Min-range

Figure 2.1 – Linear approximations of the inverse function

can involve quite complex computations, all of which commit internal round-off errors. On large input intervals, like the ones considered in [51, 52], these are probably not very significant. However, when keeping track of round-off errors, our library deals with intervals on the order of the machine epsilon. In particular, the computation of v introduces roundoff errors, but because it lies within [a,b] it is not clear whether it should be rounded up or down for soundness. In fact, when we used rounding to nearest, which is closest to the true point, Chebyshev approximations kept returning unexpected and wrong results. We thus concluded that min-range is the better choice. As discussed in [52], the Chebyshev approximation would be the more accurate choice in long-running computations; however, we simply found it to be too numerically unstable for our purpose. To our knowledge, this problem has not been acknowledged before.
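To make the min-range procedure concrete, the following real-valued sketch (illustrative names; our library additionally performs these steps with directed rounding, and with the modified ζ discussed in section 2.4) computes α, ζ and δ for a function that is monotone and has no inflection points on [a,b]:

// Min-range approximation: returns (alpha, zeta, delta) such that
// f(x) lies in alpha*x + zeta +/- delta for all x in [a, b].
// df is the derivative of f; the slope is taken at one endpoint (here b).
def minRange(f: Double => Double, df: Double => Double,
             a: Double, b: Double): (Double, Double, Double) = {
  val alpha = df(b)
  val z1 = f(a) - alpha * a          // y-intercept of the line through (a, f(a))
  val z2 = f(b) - alpha * b          // y-intercept of the line through (b, f(b))
  val zeta = (z1 + z2) / 2.0         // middle line
  // the maximum deviation occurs at an endpoint, since f' - alpha does not
  // change sign on [a, b] under the stated assumptions
  val delta = math.max(math.abs(f(a) - (alpha * a + zeta)),
                       math.abs(f(b) - (alpha * b + zeta)))
  (alpha, zeta, delta)
}

// Example: the inverse function on [2, 4], with slope taken at the right endpoint.
val (alpha, zeta, delta) = minRange(x => 1.0 / x, x => -1.0 / (x * x), 2.0, 4.0)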

Error estimation for nonlinear library functions like log, exp, cos, etc. requires specialized rounding, because the returned results are correct to 1 ulp only for the standard Java/Scala math library [88], and hence are less accurate than arithmetic operations, which are correct to within 1/2 ulp. The directed rounding procedure is thus adapted in this case to produce larger error bounds, making it possible to analyze code with the usual Scala mathematical library functions without modifications.

Soft Policy to Avoid Too Many False Warnings

Our solution follows the ‘soft’ policy advocated in [51], whereby slight domain breaches for functions that work only on restricted domains are attributed to the inaccuracy of our over-approximations and are ignored. For example, with a ‘hard’ policy, computing the square root of [−1,4] results in a run-time error, as the square root function is not defined on all of the input interval. It is possible, however, that the true interval (in a real semantics) is [0,4] and the domain problem is just a result of a previous over-approximation. In order to not

interrupt computations unnecessarily with false alarms, a ‘soft’ policy computation will give the result [0,2]. Note that our library nonetheless generates warnings in these cases, so the policy only affects the tool's ability to continue a computation in ambiguous cases, but not its rigorousness. The techniques in chapter 6 take a different approach and report such breaches as (potential) errors.

2.4 Tracking Roundoff Errors with Affine Arithmetic

Affine arithmetic as described in section 2.3 can be used to compute sound range bounds. We can also use it to track round-off errors by computing the maximum absolute round-off at each arithmetic operation and adding it to the affine form with a fresh noise symbol. Note the difference between the internal errors, which are due to computation of the range, and these added round-off errors, which stem from modeling the finite-precision computation. In fact, there exist different ways of using affine arithmetic for tracking round-off errors and we identify and implement two of them.

2.4.1 Different Interpretations of Computations

When using a range-based method like interval or affine arithmetic, it is possible to have different interpretations of what such a range represents. We consider the following three different interpretations of affine arithmetic. The first one is also the interpretation from [51].

Interpretation 2.1 (Original Affine Arithmetic)   In the original affine arithmetic, an affine form x̂ represents a range of real values, that is, [x̂] denotes an interval [a,b] for a, b ∈ R. Note that this interpretation does not consider roundoff errors.

Interpretation 2.2 (Exact Affine Arithmetic)   In exact affine arithmetic, x̂ represents one finite-precision value and its deviation from an ideal real value due to roundoff. That is, if a real-valued computation computed x ∈ R as the result, then for the corresponding computation in finite precision it holds that x ∈ [x̂].

We realize interpretation 2.2 by requiring that the central value x_0 is equal to the actually computed finite-precision value at all times, in order to guarantee sound computation of round-off errors. This interpretation is the basis for our AffineFloat data type, which we present in chapter 3.

Interpretation 2.3 (Global Affine Arithmetic)   In global affine arithmetic, x̂ represents a range of finite-precision values, that is, [x̂] denotes an interval [a,b] for a, b ∈ F.

Interpretation 2.3 is used in our SmartFloat data type. Note the difference between this interpretation and interpretation 2.1: we are computing a sound bound of the result of a finite-

precision computation, with committed round-off errors, whereas in the standard formulation we are computing a sound range of a real-valued computation without committed roundoff errors (but may be computing it with e.g. floating-points).

Usually, implementation issues are of minor interest; in the case of finite-precision computations, however, they are an important aspect: our implementation itself uses floating-point arithmetic to compute round-off errors, and we are faced in our own implementation with the very same problems that we are trying to quantify. We have identified two main challenges when implementing an affine arithmetic-based round-off error analysis:

• Implementing interpretation 2.2 with standard affine arithmetic as presented in sec- tion 2.3 is unsound.

• The resulting accuracy is unsatisfactory if affine arithmetic is implemented directly as defined previously.

In the following, we will discuss how to adapt affine arithmetic for our two interpretations, and our solution to the accuracy challenge.

2.4.2 Tracking the Computation from a Single Input Value

A possible use of affine arithmetic for keeping track of round-off errors is to represent each finite-precision number by an affine form. The central value denotes the actually computed finite-precision value and the noise terms collect the accumulated round-off errors. That is, the central value x_0 is exactly the finite-precision value and the noise symbols x_i represent the deviation due to round-off errors and approximation inaccuracies from non-affine operations. One expects to obtain tighter bounds than with interval arithmetic, especially when a computation exhibits many correlations between variables. However, a straightforward application of affine arithmetic in the original formulation is not always sound with respect to this interpretation. Namely, standard affine arithmetic takes the liberty of choosing a convenient central value in a range and does not necessarily preserve the correspondence with Double. In particular, non-affine operations such as division or trigonometric functions can shift the central value away from the actually computed finite-precision value. This correspondence is important, however, as the round-off for floating-points is computed according to Equation 2.1, i.e. by multiplication of the new central value by the machine epsilon ε_M. If the central value does not equal the actual floating-point value, the computed round-off will be that of a different result. For fixed-point arithmetic, this dependence is also present, although a difference is only visible when the two values happen to have different fixed-point formats. Affine operations maintain the invariant that the computed finite-precision value is equal to the central value. However, non-affine operations defined by computing α, ζ and δ such that the new affine form is ẑ = α·x̂ + ζ + δ ε_{n+1} do not necessarily enforce this. That is, in general (and in most cases), the new z_0 will be slightly shifted (z_0 ≠ α·x_0 + ζ). Usually, this shift is not large, but soundness cannot be guaranteed any more.



Figure 2.2 – Modified min-range approximation

Our implementation therefore provides a modified version of affine arithmetic that ensures the correspondence between the central value and the computed finite-precision value and thus ensures soundness.

Guaranteeing Soundness of Error Estimates

Our solution is illustrated in Figure 2.2. For non-linear operations, the new central value is computed as z_0 = α·x_0 + ζ and we want f(x_0) = α·x_0 + ζ, where f(x_0) is the value actually computed in finite precision. Hence, our library computes ζ as

    ζ_new = f(x_0) − α·x_0

instead of as the average y-intercept as before. δ is then correspondingly computed with respect to this ζ_new. The min-range approximation thus computes, for an input range [a,b], an enclosing parallelepiped of a function as α·x + ζ_new ± δ that is guaranteed to contain the image of the nonlinear function over this interval as computed in finite precision.

Suppose that ζ_new = f(x_0) − α·x_0, with α computed at one of the endpoints of the interval. Because we compute the deviation δ with outwards rounding at both endpoints and keep the maximum, we soundly over-approximate the function f in finite-precision semantics. Clearly, this approach only works for input ranges where the function in question is monotonic and does not have extreme or inflection points. We implement this procedure for floating-point arithmetic as part of a runtime library (see chapter 3), where we use that, by the Java API [88], the implemented floating-point library functions are guaranteed to be semi-monotonic, i.e. whenever the real function is non-decreasing, so is the floating-point one. This technique applies also to fixed-point arithmetic division. It is clear from Figure 2.2 that our modified approximation computes a bigger parallelepiped than the original min-range approximation. However, in this case, the intervals are very small to begin with, so the over-approximations do not have a big effect on the accuracy of our library.


def /(other: AffineForm): AffineForm = other match {
  case AffineForm(y0, ynoise) =>
    val (yloD, yhiD)   = other.interval      // double-precision interval of the divisor
    val (yloDD, yhiDD) = other.intervalDD    // the same interval in double-double precision
    if (yloD <= 0.0 && yhiD >= 0.0)
      return FullForm                        // possible division by zero

    if (ynoise.size == 0) {                  // exact divisor (no noise terms)
      val inv = 1.0 / y0
      // multiplication with hint: x0/y0
      if ((1.0 /↓ y0) == (1.0 /↑ y0))
        return this * (AffineForm(inv, Queue.empty), x0/y0)
      else
        return this * (AffineForm(inv, Queue(roundOff(inv))), x0/y0)
    }

    /* Calculate the inverse. */
    val (a, b)   = (min(|yloDD|, |yhiDD|), max(|yloDD|, |yhiDD|))
    val (ad, bd) = (min(|yloD|, |yhiD|),   max(|yloD|, |yhiD|))

    // slope at the right endpoint, in double-double (DD) precision
    val alpha = -1.0 /DD (b ∗DD b)

    // deviation of the approximation from the true value at the interval
    // endpoints, computed in DD precision with directed rounding
    val dmax = DD(1.0 /↑ ad) −↑DD (alpha ∗↓DD a)
    val dmin = DD(1.0 /↓ bd) −↓DD (alpha ∗↑DD b)

    // use the hint to compute zeta such that 1.0/y0 == alpha * y0 + zeta
    var (zeta, rdoff) = computeZeta(1.0/y0, alpha, y0)
    if (yloD < 0.0) zeta = -zeta

    // total deviation
    val delta = max(zeta −↑ dmin, dmax −↑ zeta) +↑DD rdoff

    // iota includes all internal roundoffs from the multiplication by alpha
    val inverse = AffineForm(1.0/y0, alpha * ynoise + iota)

    /* Multiply x * (1/y) */
    return this * (inverse, x0/y0)

  case EmptyForm => EmptyForm
  case FullForm  => FullForm
}

Figure 2.3 – Division operation for tracking one operation


Figure 2.3 shows the pseudocode for the division operation, which is representative of our nonlinear approximation computation. We further support the following mathematical functions for floating-point arithmetic: sqrt, log, exp, cos, sin, tan, arccos, arcsin, arctan. The code shows the floating-point implementation of affine arithmetic for the case of tracking a double precision floating-point computation. The methods are implemented inside the AffineForm class

    class AffineForm(x0: Double, xnoise: Queue)

where x0 is the central value and xnoise is the list of noise terms. We call these y0 and ynoise respectively for the second operand in the binary division operation. EmptyForm is the result of an invalid operation and is thus the affine arithmetic equivalent of NaN. FullForm represents the interval [−∞, ∞]. The subscript DD denotes arithmetic in double-double precision, which we discuss further later. We also use directed rounding, denoted by ↓ and ↑ for rounding towards −∞ and +∞ respectively. Division is computed as x ∗ (1/y). Our algorithm calls a special multiplication method that takes a "hint" to ensure that the central value equals the double value that would be computed. This is necessary as x ∗ (1/y) is not necessarily equal to x/y in floating-point arithmetic (although the difference is in general very small). computeZeta computes ζ such that z_0 = α·x_0 + ζ holds, and also returns the internal round-off error committed by this computation.

2.4.3 Tracking Ranges of Inputs

The instantiation of affine arithmetic just described provides a way to estimate round-off errors for one single computation. It provides reasonably tight bounds for the most common mathematical operations and is fast enough to be applied in applications such as LU decompositions and Fast Fourier transforms (see subsection 3.2.1). It can be used to provide some intuition about the behavior of a calculation and is a better alternative to interval arithmetic, which is a common choice in validated computation. It does not provide, however, any guarantee as to how large the errors would be if one chose (even slightly) different input values or constants.

Unfortunately, a straightforward reinterpretation of neither the original affine arithmetic nor the modified version for AffineFloat provides sound error estimates when tracking a finite-precision computation for a range of inputs. When tracking a range of finite-precision numbers and computing the worst-case round-off errors of each computation, we need to consider the round-off errors for all values in the range, not only for the central value as was the case for AffineFloat. In addition, the non-linear approximation algorithm does not explicitly compute the round-off errors; they are implicitly included in the computed δ. If we now have input values given by (possibly wide) ranges, the computed δ will be so large that no round-off estimate read from it is meaningful.

Our solution is the following. We represent every value by the following pair:

    ( x_0 + Σ x_i ε_i + Σ x_i u_i ,  Σ r_i ρ_i )        (2.3)


The first element of the tuple is an affine form representing a range of finite-precision values, and the second element tracks the maximum committed round-off errors for all values in that range. x_0 ∈ F is the central value as before, but we use two different kinds of noise terms to capture the range: x_i ε_i and x_i u_i. We mark noise that comes from user-defined errors by special noise symbols u_i, which we call uncertainties. For instance, when the user specifies an input value as 3.14 ± 0.5, then x_0 = 3.14 and the single noise term is 0.5 u_1, such that the represented range is [2.64, 3.64]. The ε_i's capture computation artifacts, for example from nonlinear approximations. Long running computations accumulate many noise terms, which we "compact" for performance reasons by essentially combining several noise terms into one. Since during this process we lose correlations, keeping the user-defined uncertainties separate allows us to distinguish and preferably preserve these. The details of this procedure are given in subsection 2.4.4. Σ r_i ρ_i is a zero-centered affine form whose error terms keep track of the round-off errors committed. The sum Σ |r_i| gives a sound estimate of the current maximum committed round-off error for all values within the range represented by the first part of the tuple.
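A sketch of this representation as a data structure (names illustrative, not the actual fields of SmartFloat):

// Range of finite-precision values plus separately tracked roundoff errors.
case class RangeWithErrors(
    central: Double,            // x_0
    noise: Seq[Double],         // x_i coefficients of the eps_i terms
    uncertainty: Seq[Double],   // x_i coefficients of the u_i terms
    errors: Seq[Double]) {      // r_i coefficients of the rho_i terms

  def radius: Double =
    noise.map(x => math.abs(x)).sum + uncertainty.map(x => math.abs(x)).sum

  def interval: (Double, Double) = (central - radius, central + radius)

  // sound bound on the maximum committed roundoff over the whole range
  def maxRoundoff: Double = errors.map(x => math.abs(x)).sum
}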

For the affine form represented by the first two elements of the tuple, the calculations are performed as described in section 2.3, except that the round-off errors are computed as discussed below. In addition, we need to define how to propagate the already computed round-off errors. For the original and our exact affine arithmetic this propagation has been automatically taken care of, but for tracking a range of values soundly we need to modify this procedure.

Computation of Roundoffs

To compute the maximum round-off error of an operation soundly for a range of values, our library first determines the new range [a,b]. It uses either Equation 2.2 for affine operations or the min-range approximation for nonlinear ones. For floating-point arithmetic, we use the fact that the arithmetic operations +, −, ∗, /, √ are rounded correctly, that is, the round-off is computed as

    (1/2) · ulp(max(|a|, |b|))

where ulp is a function provided by the java.lang.Math package. For other operations, correct to within 1 ulp, the round-off error is simply ulp(max(|a|, |b|)). This round-off error computation can also easily be adapted to other rounding modes, but we only consider rounding to nearest in this thesis, as it is the commonly used default rounding mode and the only one available on the JVM. For fixed-point arithmetic, the range [a,b] and the given bit length determine the best fixed-point format. The round-off error is then the quantization error of this format.
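A sketch of this computation for floating-point arithmetic, using the ulp function from java.lang.Math (the helper name and the boolean flag are illustrative):

import java.lang.Math.ulp

// Worst-case roundoff for a double result known to lie in [a, b]:
// 1/2 ulp for the correctly rounded operations (+, -, *, /, sqrt),
// 1 ulp for library functions that are only accurate to 1 ulp.
def roundoffFromRange(a: Double, b: Double, correctlyRounded: Boolean): Double = {
  val u = ulp(math.max(math.abs(a), math.abs(b)))
  if (correctlyRounded) u / 2.0 else u
}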


Propagation of Roundoffs

Already committed and accumulated errors Σ_{i=1}^{n} r_i ρ_i have to be propagated correctly for each operation. We denote the error terms of an affine form x̂ by ρ̂_x, and similarly for the other noise terms.

Affine   The propagation is given straightforwardly by Equation 2.2. That is, if the operation involves the computation αx̂ + βŷ + ζ, the errors are transformed as αρ̂_x + βρ̂_y + (ι + κ)ρ_{n+1}, where ι corresponds to the internal errors committed and κ to the new round-off error. That this is indeed correct can be seen from the fact that the computation would have been identical had those error terms been noise terms of x̂ and ŷ.

Multiplication   From multiplying out (we leave out the û's for simplicity)

    x̂ ∗ ŷ = (x_0 + ê_x + ρ̂_x) ∗ (y_0 + ê_y + ρ̂_y)
          = x_0 ∗ y_0 + (x_0 ê_y + y_0 ê_x + ê_x ê_y) + (x_0 ρ̂_y + y_0 ρ̂_x + ê_x ρ̂_y + ê_y ρ̂_x + ρ̂_x ρ̂_y)

The last term is the contribution to the new error. The linear part (x_0 ρ̂_y + y_0 ρ̂_x) is computed as before for the propagation of ranges, by multiplication by x_0 and y_0. The nonlinear part (ê_x ρ̂_y + ê_y ρ̂_x + ρ̂_x ρ̂_y) poses the difficulty that it involves cross terms between the noise and error terms. We compute the new η_e as follows:

    η_e = radius(x̂) · radius(ρ̂_y) + radius(ŷ) · radius(ρ̂_x) + radius(ρ̂_x) · radius(ρ̂_y)

Note that this produces an over-approximation, because some of the errors from the error terms are also included in the noise terms.

Other Non-affine   Because the nonlinear function approximations compute α, ζ and δ, the propagation of errors reduces to the affine propagation with one modification. The factor used to propagate the round-off errors must be, instead of α, the maximum slope of the function over the given input range, to ensure soundness. Because this value does not necessarily equal α, we compute this factor separately.

Additional Errors

Additional domain-specific errors can also be added to an affine form to be tracked for the rest of the computation. One source of these errors are measurement inaccuracies in systems that interact with the physical world through sensors. Another source are truncation errors, i.e. the differences between a (hypothetical) analytical solution and a result computed with an iterative method. Given x̂ = (x_0 + Σ x_i ε_i + Σ x_i u_i, Σ r_i ρ_i) and the error to be added ŷ = (y_0 + Σ y_i ε_i + Σ y_i u_i, Σ s_i ρ_i), the resulting affine form is given by

    ẑ = ( x_0 + Σ x_i ε_i + Σ x_i u_i + ( |y_0| + radius(Σ y_i ε_i + Σ y_i u_i) ) ε_{n+1} ,
          Σ r_i ρ_i + radius(Σ s_i ρ_i) ρ_{m+1} )

That is, the maximum magnitude of the error is added as a new noise term, and the maximum magnitude of the round-off committed when computing this error is added as a new error term.

2.4.4 Achieving Accuracy

It turns out that even when choosing the min-range approximation with input ranges of small widths (order 10^−10 and smaller), computing the result of a nonlinear function in interval arithmetic gives better (more accurate) results. The computation of α and ζ in our approximation cannot be changed for soundness reasons, but it is possible to optimize the computation of δ. Some of our applications use a rational data type in the implementation for this reason, but for a runtime application performance becomes an issue due to the long integers. In order to avoid arbitrary-precision libraries also for performance reasons, our library uses double-double precision (denoted as DD) as a suitable compromise for tracking round-off errors for a computation performed in double floating-point precision. For other precisions, the internal precision needs to be adjusted accordingly. Each value is represented by two standard double precision values. Algorithms have been developed and implemented [131, 49] that allow the computation of standard arithmetic operations with only ordinary floating-points, making the performance trade-off acceptable. We found that using double-double precision reduces the internal round-off errors sufficiently for most nonlinear function approximations. To ensure soundness we also apply directed rounding to double-double computations.
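As an illustration of the kind of building block double-double arithmetic rests on, the following is the classic error-free addition (Knuth's TwoSum), which recovers exactly the round-off committed by a double addition; the double-double operations from [131, 49] are built from such transformations (this sketch is not our library code):

// Error-free transformation: s is the rounded double sum and e the exact
// round-off it committed, so that s + e == a + b exactly.
def twoSum(a: Double, b: Double): (Double, Double) = {
  val s = a + b
  val t = s - a
  val e = (a - (s - t)) + (b - t)
  (s, e)
}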

Precise Handling of Constants   A single value a is represented in a real-valued interval semantics as the point interval [a,a], or in affine arithmetic as x̂ = a without noise terms. This no longer holds for finite-precision values that cannot be represented exactly in the underlying finite representation. Our library tests each value for whether it can be represented or not in a given precision and adds noise terms only when necessary. Adding round-off errors only when constants cannot be represented exactly limits the over-approximations committed and provides a more accurate analysis.

Managing Noise Symbols in Long Computations The runtime performance of our library depends on the number of noise terms in each affine form, because each operation is at least linear in the number of noise terms. Hence, an appropriate “compacting” strategy for noise symbols becomes crucial for performance in longer calculations. Compacting too little means that our approach becomes too slow, whereas compacting too much means the loss of too much correlation information.


The goal of compaction is to take as input a list of noise terms and output a new list with fewer terms, while preserving the soundness of the round-off error approximation and, ideally, keeping the most important correlation information. Our library performs compaction by adding up the absolute values of the smaller terms and introducing them as a fresh noise symbol along with the (unchanged) remaining terms. We propose the following strategy to compute the new noise terms, which is used when the number of noise terms exceeds a certain user-defined threshold and attempts to reduce the number below it. In our implementation, all the parameters can be adjusted for a specific application.

i) Compact all error terms smaller than a given threshold which identifies (likely) internal errors. For instance, for the analysis of double floating-point precision computations, we set this threshold to 10^−33.

ii) Compute the average (avrg) and the standard deviation (stdDev) of the remaining noise terms. Compact all terms smaller than avrg · a + stdDev · b and keep the rest. The factors a and b are user-controllable positive parameters, and can be chosen separately for each computation. (The result is sound regardless of the particular values; a sketch of this step is given after this list.)

iii) In some cases the above steps are still not enough to ensure that the number of symbols is below the threshold. This occurs, for example, if nearly all errors have the same magnitude. If our library detects this case, it repeats the above procedure one more time on the newly computed noise terms. In our examples, at most two iterations were sufficient. In pathological cases in which this does not suffice, the library compacts all noise symbols into a single one.
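A sketch of the compaction step ii) (illustrative; the threshold for step i) and the iteration of step iii) are omitted):

// Replace all noise terms smaller than avrg*a + stdDev*b by a single fresh
// term holding the sum of their absolute values; larger terms are kept, so
// the total radius never shrinks and soundness is preserved.
def compact(noise: Seq[Double], a: Double = 1.0, b: Double = 1.0): Seq[Double] =
  if (noise.isEmpty) noise
  else {
    val magnitudes = noise.map(x => math.abs(x))
    val avrg = magnitudes.sum / magnitudes.size
    val stdDev =
      math.sqrt(magnitudes.map(m => (m - avrg) * (m - avrg)).sum / magnitudes.size)
    val threshold = avrg * a + stdDev * b
    val (small, large) = noise.partition(x => math.abs(x) < threshold)
    if (small.isEmpty) noise
    else large :+ small.map(x => math.abs(x)).sum   // fresh compacted noise term
  }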

2.4.5 Correctness

One condition for the presented round-off error analysis to be sound is that the operations are performed in exactly the order given in the source code, without optimizations or fused multiply-add instructions. We have enforced this for our code by using the strictfp modifier for the calculations. We are able to avoid several pitfalls related to floating-point numbers [23, 118] by writing our library in Scala and not, for example, in C, as the JVM is not as permissive towards optimizations that may alter the actual execution of the code.

The correctness of each step of the interval or affine arithmetic computation implies the correctness of our overall approach: for each operation in interval or affine arithmetic the library computes a rigorous over-approximation, and thus the overall result is an over-approximation. This means that, for all computations, the resulting interval is guaranteed to contain the result that would have been computed in an ideal real semantics.

The use of assertions in our implementation, certifying that certain invariants always hold, supports the correctness of our implementation. Example invariants for the case when we track only one computation include the statement that the computed double precision value has to

be exactly the same as the central value of the affine form, a prerequisite for our round-off analysis.

In addition, we have tested our library extensively on several benchmarks (see section 3.2) and checked our implementation of nonlinear functions against 30-digit precision results from Mathematica.

2.5 Comparison with Fluctuat

We are aware of one tool, Fluctuat [72], which is able to quantify roundoff errors soundly and automatically. It is an abstract-interpretation-based static analysis tool whose core analysis, although developed independently, relies on affine arithmetic in a very similar way to our design. The techniques described in this chapter for tracking a single computation are dynamic in nature, since we are essentially running a computation with some additional information computed on the side. In contrast, the technique for computing worst-case error bounds for a whole range of inputs we view as static, and it is this technique that we compare against Fluctuat. The SmartFloat data type implementing this technique (see chapter 3) is a kind of hybrid, in the sense that we execute the control flow dynamically, but bound the roundoff errors in a static way. Our techniques in later chapters (6 and 7) generalize this into a proper static analyzer.

Like our technique, Fluctuat separates the range from the error estimation and keeps track of both with affine arithmetic, but it also keeps a floating-point interval on the side. We did not find computing an additional interval side-by-side to provide more accuracy in general, so we keep only the two affine forms for efficiency reasons. Our design differs from Fluctuat's in a number of details and we review them briefly here.

Fluctuat uses an arbitrary-precision library for internal computations. We use either double-double floating-point precision for our runtime libraries, or a rational data type for the implementation in static approaches. Our experiments in chapter 3 confirm that this precision is sufficient for a runtime approach and we believe that it is preferable over arbitrary precision from an efficiency standpoint. For the use in a static analysis tool (chapters 5-7), we chose a rational data type for accuracy and compatibility reasons, as we interface with an SMT solver which internally also uses rationals. Efficiency of the numerical data type did not appear as a bottleneck in our experiments (compared with the time taken by the SMT solver).

Our techniques are targeted towards scientific computing applications so that we support many transcendental functions. Fluctuat has been developed with industrial embedded code in mind and thus supports only standard arithmetic with square root. Where our nonlinear operations use the min-range approximation, Fluctuat approximates division and square root with a first-order Taylor approximation. Furthermore, we keep all higher-order terms by default and compact as needed, whereas Fluctuat collects all higher-order contributions in a single noise term. We have not performed a systematic and direct comparison of the

individual features, but this would certainly be an interesting project for the future. In our experiments, presented in chapter 3, we have not found big differences: sometimes our implementation provides better results and sometimes Fluctuat's does. Overall, we believe that more fundamental differences and extensions, such as those we present in chapter 6, have a bigger impact.

Conclusion

We have presented an adaptation of affine arithmetic for the estimation of round-off errors. In fact, we discuss two such adaptations: one for the case of tracking a single value through a computation, including its round-off errors, which can serve as a replacement for the traditionally used interval arithmetic. The second adaptation shows how to use two affine forms to track both the ranges of a computation whose inputs are interval-valued and the corresponding worst-case round-off errors. We further describe how to implement nonlinear operations, including transcendental functions, with floating-point arithmetic such that the implementation computes sound and correct results. We are not aware of other work that can do this computation for very small ranges (on the order of machine epsilon) and with a performance that is acceptable for runtime applications.

3 Tracking Roundoff Errors at Runtime

Previously, we have described our roundoff error computation, which is applicable to both floating-point and fixed-point arithmetic. This chapter (based on the paper [44]) presents an application of affine arithmetic to track round-off errors in double precision floating-point computations at runtime. We chose floating-points as the target, since Doubles have established themselves as the default data type for implementing software that approximates real-valued computations. Since the definition of the IEEE 754 standard [84], these computations have also become reliable and, to a certain extent, portable [118].

The general idea is to replace the floating-point data types with a smarter version which automatically keeps track of the accumulating errors. Our smarter data types come in two flavors: AffineFloat tracks a single computation and corresponds to the affine arithmetic interpretation of subsection 2.4.2, and SmartFloat tracks a range of values and their worst-case round-off errors, corresponding to the interpretation in subsection 2.4.3. AffineFloats can often provide tighter error estimates than SmartFloats, at the expense of generality. Our data types improve over a computation performed in interval arithmetic in terms of accuracy and generality.

We implemented our data types as a runtime library in Scala [125] for tracking round-off errors for double floating-point arithmetic. A seamless integration allows users to apply our library with very few changes to existing floating-point code. A runtime library is especially useful in the case of floating-point numbers, because the knowledge of exact values enables us to provide a tight analysis. Also, with our tight integration it is possible to use any Scala construct, thus not restricting the user to some subset that an analyzer front-end can handle. The library presented is specialized for double floating-point precision, as this is the common default choice in scientific computing, but the underlying techniques are general and can be applied to other floating-point or fixed-precisions equally well. Our code is open-source and available from github.com/malyzajko/ceres.

We evaluate our implementation on a number of benchmarks from scientific computing, and compare our results against those computed by interval arithmetic and the tool called


Fluctuat [72]. Fluctuat is an abstract-interpretation-based static analyzer and is the only other tool we are aware of for soundly estimating round-off errors. However, while our code has always been open-source, Fluctuat's binary became available only recently; hence we compare our results against the newest version of Fluctuat.

3.1 Integration into Scala

Our library provides the wrapper types AffineFloat and SmartFloat that are meant to replace the Double type in the part of a program that the user wishes to check. For instance, the code in Figure 3.1 shows a short function computing the area of a triangle with sides a, b and c using SmartFloats. All that is needed to put our library into action are two import statements and the replacement of Double types by one of our AffineFloat or SmartFloat types. Any remaining conflicts are signaled by the compiler’s strong type checker. The computed errors can be accessed in different ways, with the methods absError, interval and toString. absError simply returns the maximum absolute error, interval returns an interval of the variable which is sound with respect to all uncertainties and roundoffs, and finally toString provides a pretty-printed representation of the computed value and its associated error.

Our new data types handle definitions of variables and the standard arithmetic operations, as well as many scala.math library functions, including the most useful ones:

• log, exp, pow, cos, sin, tan, acos, asin, atan

• abs, max, min

• constants Pi (π) and E (e)

The library also supports the special values NaN and ±∞ with the same behavior as IEEE floating-point arithmetic. To accomplish such an integration, we needed to address the following aspects.

import ceres.smartfloat.SmartFloat
import SmartFloat._

...

def triangleTextbook(a: SmartFloat, b: SmartFloat, c: SmartFloat): SmartFloat = {
  val s = (a + b + c)/2.0
  sqrt( s*(s-a)*(s-b)*(s-c) )
}

Figure 3.1 – Computing the area of a triangle with SmartFloats


Operator overloading   Developers should be able to use the usual operators +, -, *, / without having to rewrite them as functions, e.g. as x.add(y). Scala supports operator overloading in the sense that +, -, *, / are regular method calls, and x.m(y) can equivalently be written as x m y [125], resulting in the familiar syntax for arithmetic.

Symmetric equals Comparisons between our data types and regular numeric Scala types should be symmetric, that is, the following should hold

val x: SmartFloat = 1.0
val y: Double = 1.0
assert(x == y && y == x)

In Scala, == delegates to the equals method, if one of the operands is not a primitive type. However, this is not enough to ensure a symmetric comparison, as Double, or any other built-in numeric type, cannot compare itself correctly to a SmartFloat. For this case, Scala provides the trait ScalaNumber which has a special semantics in comparisons with ==. If y is of type ScalaNumber, then both x == y and y == x delegate to y.equals(x) and thus the comparison can be correctly and symmetrically handled inside our data type classes.

Mixed arithmetic Developers should be able to freely combine our data types with Scala’s built-in primitive types, as in the following example

val x: SmartFloat = 1.0
val y = 1.0 + x
if (5.0 < x) { ... }

This is made possible with Scala’s implicit conversions, strong type inference and companion objects [125]. In addition to the class SmartFloat, the library defines the (singleton) object SmartFloat, which contains an implicit conversion such as

implicit def double2SmartFloat(d : Double): SmartFloat = new SmartFloat(d)

As soon as the Scala compiler encounters an expression that does not type check, but a suitable conversion is present in scope, the compiler inserts an automatic conversion from the Double type in this case to a SmartFloat. Implicit conversions allow a SmartFloat to show a very similar behavior to the one exhibited by primitive types and their automatic conversions. The code for AffineFloat is analogous.

Library functions   Having written code that utilizes the standard mathematical library functions, developers should be able to reuse their code without modification. Our library implements these functions in the companion AffineFloat or SmartFloat objects. It is thus possible to write code such as

val x: SmartFloat = 0.5
val y = sin(x) * Pi


Concise code   For ease of use and general acceptance, it is desirable to be able to skip the new keyword in definitions and to simply write SmartFloat(1.0). This is made possible in Scala, where SmartFloat(...) is syntactic sugar for a call to the special SmartFloat.apply(...) method (also placed in the companion object).

3.1.1 Conditionals

Path consistency is ensured by the compare method of the AffineFloat and SmartFloat data types, which takes uncertainties into account. The user can choose between two different behaviors: if a comparison x < y cannot be decided for sure due to uncertainties on the arguments, either a warning is printed or an exception is thrown. In the case where the user chooses to throw an exception, we provide two methods for handling such a failure gracefully but yet explicitly:

def certainly(b: => Boolean): Boolean = {
  try b catch { case e: ComparisonUndeterminedException => false }
}

def possibly(b: => Boolean): Boolean = {
  try b catch { case e: ComparisonUndeterminedException => true }
}

If we cannot be sure a boolean expression involving SmartFloats is true, we assume it is false in the case of certainly, and that it is true in the case of possibly. Hence, the following identity holds:

    if (certainly(P)) T else E   ⇔   if (possibly(!P)) E else T

We show a possible application in subsection 3.2.3.

3.2 Experimental Results

We have selected several benchmarks for evaluating our library. Many of them were originally written in Java or C, and we ported them to Scala as faithfully as possible. Once written in Scala, we found that changing the code to use our AffineFloat or SmartFloat types instead of Double is a straightforward process and needs only a few manual edits. The Scala compiler's type checker was particularly helpful in this process.

3.2.1 Accuracy of AffineFloat

We evaluate the accuracy of our AffineFloat data type against a round-off error estimation implementation based on interval arithmetic, as described in section 2.2. For this purpose, our library also provides the IntervalFloat data type, with the same signatures as AffineFloat, but with a back-end implementation using intervals implemented with floating-point arithmetic and directed rounding. Our chosen benchmarks for this evaluation are the following:


Nbody simulation is a benchmark from [6] simulating the orbits of the planets Jupiter, Saturn, Uranus and Neptune around the Sun using a simple symplectic integrator. Since the benchmark includes many variables that we could potentially measure, we choose the total energy of the system as our measured value. The energy changes both due to truncation errors in the integration method and due to round-off errors. Here, we measure the latter.

Spectral norm is a benchmark from [6] that calculates the spectral norm of the infinite matrix with entries a_11 = 1, a_12 = 1/2, a_21 = 1/3, a_13 = 1/4, a_22 = 1/5, a_31 = 1/6, etc. The benchmark uses the power method, which is an iterative method, and we choose to study the error on the single output of this function for different numbers of iterations.

Jacobi Successive Over-relaxation (SOR) is taken from the SciMark 2.0 benchmark set [130] and is an iterative numerical method for solving a linear system of equations. We report errors for different numbers of iterations. The input is a random matrix of size 100x100.

LU decomposition is also taken from SciMark, and performs an LU decomposition of a matrix A, with or without pivoting, which can then be used to solve a system of linear equations Ax = b. Pivoting attempts to select larger elements in the matrix during factorization to avoid numerical instability. We compare the errors on the solution x.

FFT performs a one-dimensional Fast Fourier Transform. This benchmark is also part of SciMark. We perform the forward and backward transform and report the maximum absolute error on the recovered vector.

For the last three benchmarks, where the input data is vector or matrix-valued, we report the maximum error over all entries. We use matrices with random entries with maximum value 10.0 as inputs. We ran the code with different initial seeds and can confirm that the results we present here are consistent across runs.

Table 3.1 shows a comparison between the absolute errors computed by the IntervalFloat, AffineFloat and SmartFloat data types (we compare performance in subsection 3.2.4). These results provide an idea of the order of magnitude of round-off error estimates, as well as the scalability of our approach. Note that none of these benchmarks is known to be particularly unstable with respect to floating-point errors, so we cannot observe especially bad behavior. We can see that our AffineFloats and SmartFloats give consistently better, that is, more accurate bounds on the absolute round-off errors. The numbers for the SOR and LU benchmarks also suggest that our library generally scales better in terms of accuracy on longer computations than interval arithmetic. The type of computation has a strong influence on how fast the over-approximation of error bounds grows; linear computations can naturally be handled better with affine arithmetic than nonlinear ones. Furthermore, the numbers show that in most cases, when considering a single computation, AffineFloat computes tighter bounds by leveraging its different and more accurate semantics. On three benchmarks SmartFloat actually performs


benchmark           details           IntervalFloat   AffineFloat   SmartFloat

Nbody               1s, dt=0.01       4.20e-14        1.94e-14      1.95e-14
                    1s, dt=0.015625   2.83e-14        1.28e-14      1.29e-14
                    5s, dt=0.01       3.12e-12        7.55e-13      7.57e-13
                    5s, dt=0.015625   2.06e-12        4.89e-13      4.98e-13
Spectral norm       2 iter.           1.69e-14        1.33e-15      1.73e-15
                    5 iter.           6.24e-14        3.33e-15      1.07e-14
                    10 iter.          1.43e-13        9.77e-15      1.12e-14
                    15 iter.          2.24e-13        1.38e-14      1.27e-13
                    20 iter.          3.03e-13        3.75e-14      1.82e-14
SOR                 5 iter.           6.17e-13        1.30e-13      1.88e-13
                    10 iter.          4.01e-11        2.77e-12      3.80e-12
                    15 iter.          2.70e-9         5.68e-11      7.84e-11
                    20 iter.          1.84e-7         1.19e-9       1.39e-9
LU with pivoting    dim 5             3.89e-14        6.40e-15      2.17e-14
                    dim 10            2.14e-11        2.34e-12      NaN
                    dim 15            4.07e-9         1.53e-10      NaN
LU w/o pivoting     dim 5             6.53e-9         6.01e-11      NaN
                    dim 10            1.51e-8         1.44e-10      NaN
                    dim 15            4.36e-1         8.24e-5       NaN
FFT                 dim 256           4.11e-11        9.66e-12      7.21e-12
                    dim 512           1.33e-10        3.77e-11      2.44e-11

Table 3.1 – Comparison of absolute errors computed by IntervalFloat, AffineFloat and SmartFloat, with a compacting threshold of 42 (which seemed like a good compromise between accuracy and performance).


3.2.2 Accuracy of SmartFloat

We evaluate the accuracy and usefulness of our SmartFloat data type on a number of selected benchmarks.

• The triangle example from Figure 3.1 is known to exhibit cancellation, which happens when two quantities close in value are subtracted and thus many of the correct digits cancel out in the process. This happens in particular for flat triangles, where we observe a loss of accuracy.


Figure 3.2 shows an improved version due to [93]. The code first sorts the sides of the triangle by their lengths and refactors the final formula such that computations are performed in an order that minimizes accuracy loss.

• Another famous example is the formula for computing the roots of a quadratic equation. The formulation usually taught in school produces results that are orders of magnitude less accurate when one root is much smaller than the other. Figure 3.3 shows both the classical formulation as well as an improved version from [71].

• The following equation computes the frequency change due to the Doppler effect:

z = dv/du = -(331.4 + 0.6*T) * v / (331.4 + 0.6*T + u)^2

This benchmark is challenging because it is nonlinear, includes a division, and contains a strong correlation between the numerator and the denominator.

• B-spline basis functions are commonly used in image processing [91]:

B0(u) = (1 - u)^3 / 6
B1(u) = (3u^3 - 6u^2 + 4) / 6
B2(u) = (-3u^3 + 3u^2 + 3u + 1) / 6
B3(u) = u^3 / 6

With u ∈ [0, 1], this benchmark also features strong correlations. (A sketch of these functions written against our library is shown below.)
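As a concrete illustration, the B-spline functions can be written directly against the SmartFloat type. This is a sketch of our own, not the thesis' benchmark code; it assumes the library's implicit conversions between Double literals and SmartFloat, as suggested by the examples in Figures 3.3 and 3.4.

// The four cubic B-spline basis functions over SmartFloat.
def b0(u: SmartFloat): SmartFloat = (1.0 - u) * (1.0 - u) * (1.0 - u) / 6.0
def b1(u: SmartFloat): SmartFloat = (3.0*u*u*u - 6.0*u*u + 4.0) / 6.0
def b2(u: SmartFloat): SmartFloat = (-3.0*u*u*u + 3.0*u*u + 3.0*u + 1.0) / 6.0
def b3(u: SmartFloat): SmartFloat = u*u*u / 6.0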

Table 3.2 compares the round-off errors computed by our tool against those computed by Fluctuat [72]. At the time we developed our library (2011), Fluctuat was not available for comparison, so the numbers we report here were obtained later with the spring 2014 version of Fluctuat, which may include more recent improvements. We note that the differences are, in general, rather small; determining which combination of design choices is optimal remains interesting future work. Both tools are clearly able to determine that Kahan's triangle area and the smarter quadratic root computation produce more accurate results than their textbook versions.


def triangleKahan(a0: SmartFloat, b0: SmartFloat, c0: SmartFloat): SmartFloat = {
  // local mutable copies, so that the sides can be sorted by length
  var a = a0; var b = b0; var c = c0
  if (b < a) {
    val t = a
    if (c < b)      { a = c; c = t }
    else if (c < a) { a = b; b = c; c = t }
    else            { a = b; b = t }
  } else if (c < b) {
    val t = c; c = b
    if (c < a) { b = a; a = t } else { b = t }
  }
  sqrt((a + (b + c)) * (c - (a - b)) * (c + (a - b)) * (a + (b - c))) / 4.0
}

Figure 3.2 – Computing the area of a triangle with SmartFloats and Kahan’s improved version of the formula.

val a: SmartFloat = ...
val b: SmartFloat = ...
val c: SmartFloat = ...

val discr = b*b - a * c * 4.0

// classical way
var r1 = (-b - sqrt(discr))/(a * 2.0)
var r2 = (-b + sqrt(discr))/(a * 2.0)

// smarter way
val (rk1: SmartFloat, rk2: SmartFloat) =
  if (b*b - a*c > 10.0) {
    if (b > 0.0)
      ( (-b - sqrt(discr))/(a * 2.0), c * 2.0 /(-b - sqrt(discr)) )
    else if (b < 0.0)
      ( c * 2.0 /(-b + sqrt(discr)), (-b + sqrt(discr))/(a * 2.0) )
    else
      ( (-b - sqrt(discr))/(a * 2.0), (-b + sqrt(discr))/(a * 2.0) )
  } else
    ( (-b - sqrt(discr))/(a * 2.0), (-b + sqrt(discr))/(a * 2.0) )

Figure 3.3 – Quadratic formula in two versions


benchmark           inputs                            SmartFloat        Fluctuat

triangle textbook   a = 9, b = c = [4.71, 4.89]       5.46272459e-14    5.49398942e-14
                    a = 9, b = c = [4.61, 4.79]       7.09227167e-14    7.19504679e-14
                    a = 9, b = c = [4.501, 4.581]     1.16065397e-12    8.38212416e-13

triangle Kahan      a = 9, b = c = [4.71, 4.89]       1.62947801e-14    1.36572207e-14
                    a = 9, b = c = [4.61, 4.79]       1.93681987e-14    1.52819080e-14
                    a = 9, b = c = [4.501, 4.581]     1.94849999e-13    1.29513866e-13

quadratic classic   a = [2.499, 3.499],         r1    1.17647194e-14    1.02354065e-14
                    b = [55.5001, 56.5001],     r2    1.57388750e-15    1.60866674e-15
quadratic smarter   c = [0.50074, 1.50074]      rk1   1.17647194e-14    1.02354065e-14
                                                rk2   4.24657821e-17    1.11700058e-17

doppler             u = [-100, 100], v = [20, 20000], 3.69565423e-12    3.90280004e-13
                    T = [-30, 50]

bsplines            u = [0, 1]                  b0    1.57281595e-16    1.61907525e-16
                                                b1    8.93960831e-16    9.25185854e-16
                                                b2    8.37293198e-16    8.32667269e-16
                                                b3    9.13621031e-17    1.06396374e-16

Table 3.2 – Comparison of absolute errors computed by SmartFloat and Fluctuat

3.2.3 Spring Simulation

So far, we have only measured round-off errors; however, errors can come from other sources as well. For instance, during the integration of an ordinary differential equation, the numerical algorithm accumulates truncation errors, i.e. errors due to the discretization of the integration algorithm. Unlike round-off errors, truncation errors depend heavily on the particular integration method used, so that no simple unique formula for quantifying these errors exists. One example that illustrates these errors is the simulation of an (undamped and unforced) spring in Figure 3.4. For simplicity, we use Euler's method. Although this method is known to be too inaccurate for many applications, it provides a good application showcase for our library. Note the method addError in line 15. In this example, we compute a coarse approximation of the method error by computing the maximum error over the whole execution. What happens behind the scenes is that our library adds an additional error to the affine form representing x, i.e. it adds a new noise term in addition to the errors already computed.


 1  def springSimulation(h: SmartFloat) = {
 2    val k: SmartFloat = 1.0
 3    val m: SmartFloat = 1.0
 4    val xmax: SmartFloat = 5.0
 5    var x: SmartFloat = xmax   // curr. horiz. position
 6    var vx: SmartFloat = 0.0   // curr. velocity
 7    var t: SmartFloat = 0.0    // curr. 'time'
 8    var truncError = k*m*xmax * (h*h)/2.0
 9
10
11    // while (certainly(t < 1.0)) {
12    while (t < 1.0) {
13      val x_next = x + h * vx
14      val vx_next = vx - h * k/m * x
15      x = x_next.addError(truncError)
16      vx = vx_next
17      t = t + h
18    }
19  }

Figure 3.4 – Simulation of a spring with Euler's method

At the end of the simulation, we obtain the following results. The numbers in parentheses are the maximum absolute round-off errors committed. We have rounded the output outwards for readability.

step size    t                             x

h = 0.1      [1.099, 1.101] (8.55e-16)     [2.174, 2.651] (7.4158e-15)
h = 0.125    [1.0, 1.0]     (0.00e+0)      [2.618, 3.177] (4.04e-15)
h = 0.01     [0.999, 1.001] (5.57e-14)     [2.699, 2.706] (6.52e-13)

When running this simulation with step sizes 0.1 and 0.01, time t cannot be computed accurately, whereas with h = 0.125, which is exactly representable in binary, the result is exact. Our library detects this, and prints the warning comparison failed! when evaluating the while condition in the last iteration. This warning is (correctly) not printed for step size h = 0.125. Now consider x. We can see that choosing a smaller step size makes the enclosure of the result smaller and thus more accurate, as expected. But note that a smaller step size also increases the overall round-off errors. This is also to be expected, because we have to execute more computations.

Note that this accurate analysis of round-off errors is only possible with the separation of round-off errors from other uncertainties. Our SmartFloat type can thus be used in a more general framework that guarantees soundness with respect to a floating-point implementation but that also includes other sources of errors.


benchmark                   double   IntervalFloat   AffineFloat   SmartFloat

Nbody, 1000 steps           0.2      28.5            1446.1        7079.8
Spectral norm, 20 iter.     1.5      35.4            168.4         410.0
SOR, 20 iter.               1.1      52.5            4404.7        9396.1
LU w/ pivoting, size 15     2.7      3.2             12.4          40.0
FFT, size 512               4.5      7.2             247.3         1088.5

Table 3.3 – Comparison of running times of our different data types and plain double precision. The compacting threshold was set to 42.

An alternative to printing the comparison failed! warning is to let SmartFloat throw an exception when the comparison on line 12 cannot be decided with certainty. We can catch this exception, for example, with our certainly construct, as in line 11. In this case, the code performs one iteration less. Had we chosen possibly, the comparison would be more permissive and the number of iterations would have been the same as when running this application with plain Doubles.

3.2.4 Performance

Our technique aims to provide much more information than an ordinary floating-point execution while using the same concrete execution. We therefore do not expect the performance to be comparable to that of an individual double precision computation on dedicated floating-point hardware. Nonetheless, our technique is effective for unit testing and for exploring smaller program fragments one at a time. AffineFloat and SmartFloat use the floating-point implementation of affine arithmetic, as we found that the program fragments we consider in our benchmarks are already long enough to slow down a rational implementation unacceptably. The runtimes of AffineFloat and SmartFloat are summarized in Table 3.3. SmartFloats essentially use two affine forms, which accounts for the larger runtimes. Given that the computations are long-running, we believe the runtimes remain acceptable. Similarly, total memory consumption in these benchmarks was not measurably large. We used the ScalaMeter framework [132] for benchmarking.

3.2.5 Compacting

We have also run experiments to determine the effect the maximum number of noise terms has both on the runtime and on the accuracy. Figure 3.5 shows the effect of different noise symbol count thresholds on the runtime and accuracy. The normalized running times of AffineFloat on our benchmarks show a more or less linear dependence on the threshold, and the table with error bounds gives an idea of the loss of correlation and accuracy that compacting can cause.


[Graph: normalized runtime of AffineFloat on the NBody, Spectral, SOR, LU and FFT benchmarks, plotted against the compacting threshold (20 to 100).]

benchmark       20         40         60         80         100

Nbody 1000      6.27e-11   4.72e-11   3.64e-11   3.30e-11   3.32e-11
Spectral 20     3.66e-14   6.46e-14   2.66e-14   6.02e-14   2.33e-14
SOR 20          3.19e-8    1.96e-9    4.25e-10   1.65e-10   8.92e-11
LU 15           4.77e-10   2.63e-10   1.36e-10   5.35e-11   4.22e-11
FFT 512         6.79e-11   4.41e-11   3.41e-11   2.37e-11   1.80e-11

Figure 3.5 – Effect of the maximum number of noise symbols on the running time (graph) and error bounds (table).

Based on this trade-off, the user of our library can set a different threshold for different applications, and even change it dynamically.

3.2.6 Position of the Moon Case Study

Our final example in this chapter is an astronomical program that computes the moon's position as a function of the date. We obtained two implementations of the algorithm presented in [115] from the Geneva observatory, one in Java and one in Python. They observed that two otherwise identical implementations sometimes returned different results, and were thus worried about the accuracy. The implemented algorithm is about 120 lines of code, and performs 946 linear operations, 1390 multiplications, 74 divisions or square roots and calls 209 trigonometric functions. The algorithm returns two coordinates α and δ; for example,

for the input date 2012-2-10 it returns (173.69903942141244, -2.607438794791873) in Python, and (173.69903942141244, -2.607438794791856) in Java; note the difference in the last two digits. This difference becomes even more visible when these “raw” coordinates are converted into equatorial coordinates.

We used our AffineFloat data type on this code and our results are summarized in the following table.

               α                             δ

Python         173.69903942141244            -2.607438794791873
AffineFloat    173.69903942141244            -2.607438794791856
               ± 6.2172e-9                   ± 2.5341e-9

Python         11h 34m 47.76946113898774     -2d 36m 26.77966125074235
AffineFloat    11h 34m 47.76946113898774     -2d 36m 26.779661250681812
               ± 1.4921e-6                   ± 9.1226e-6

The top part of the table shows the raw coordinates computed, and the bottom part the converted equatorial coordinates. Our AffineFloat running on the JVM returns the same result as the plain Java code, so we only list one. The raw coordinate data only differs in the last two digits of the δ value, but this error grows when the values are converted to equatorial coordinates. This is also reflected in the growth of the error reported by our AffineFloat (from order 1e-9 to order 1e-6). Finally, with our data type and analysis, we are able to certify that the digits in bold are definitely correct (which is sufficient for the particular application that used this code).

Conclusion

We have presented a runtime instantiation of our round-off error computation and demonstrated its usability and utility on a number of examples. We have extensively analyzed our implementation in terms of accuracy of results, scalability and performance, and shown that even a ‘straightforward’ affine arithmetic implementation can provide decent results. In particular, we have thoroughly compared the affine arithmetic performance against interval arithmetic and can confirm that the theoretical advantages do indeed translate to the implementation, at least for mostly linear computations.


4 Synthesizing Accurate Fixed-Point Expressions

Many algorithms in control and signal processing are implemented on embedded devices which do not have a hardware floating-point unit, due to its higher energy consumption and cost. Software emulation is slow and may thus not be an option either. Hence, these algorithms are often implemented in fixed-point arithmetic. Much research exists on bit width allocation [101, 110, 94], that is, the determination of the number of integer and fractional bits. The order of computation has been largely disregarded and is often left as generated, for example, from MATLAB [113], or as written by hand. However, even with an optimal bit width allocation, the final round-off error can depend on the exact order of computation due to the lack of associativity of fixed-point arithmetic. As we will show, this difference can be quite large.

One of the desired properties of control systems is asymptotic stability, i.e. the property that the state of the controlled plant asymptotically converges to a given point. It has been shown [9] that when implementation errors are present, this convergence can only be shown for a set around the point, called the region of practical stability, whose size depends on the magnitude of the round-off errors. It is thus of practical importance to reduce errors as much as possible. One way to reduce errors is to increase the bit length available for fixed-point arithmetic. This approach, however, can be costly, especially when a gradual increase is not possible and a whole word needs to be added (e.g. going from 16 to 32 bits). An approach that can avoid this cost, such as ours, is thus preferable.

In this chapter we present our technique and tool called Xfp for synthesizing accurate fixed-point expressions from their real-valued specification. Our technique is based on a heuristic search with genetic programming [129] and uses our affine arithmetic based error computation as the fitness function. Since we are searching among mathematically equivalent expressions, we have adapted genetic programming to this setting by defining our own mutation and crossover operators. We focus mostly on linear controllers, which are very common.

The work presented in this chapter is based on [48]. The source code is available at github.com/malyzajko/xfp.


4.1 Exploiting Non-Associativity

Consider a controller for a batch reactor processor [134]. The computation of one state of the controller is given by the following expression:

(-0.0078) * st1 + 0.9052 * st2 + (-0.0181) * st3 + (-0.0392) * st4 + (-0.0003) * y1 + 0.0020 * y2     (4.1)

where st_i is an internal state of the controller and y_i are the inputs. Similar expressions compute the other states and outputs of the controller. Suppose we want to implement this controller in fixed-point arithmetic. If we assume an input range of [-10, 10] for all input variables and a uniform bit length of 16, each input variable gets assigned a fixed-point format with 4 integer and 11 fractional bits. The constant 0.0078 gets 22 fractional bits: 0.0078 < 2^(16-1-22) = 2^(-7) = 0.0078125. If we now multiply st1 by 0.0078, the result will have 33 bits, which we fit into 16 bits by performing a right shift. Continuing in this fashion and following the order of arithmetic operations in (4.1), we obtain the following fixed-point arithmetic program:

val tmp0 = ((-32716L * st1) >> 18)
val tmp1 = ((29662L * st2) >> 15)
val tmp2 = ((tmp0 + (tmp1 << 4)) >> 4)
val tmp3 = ((-18979L * st3) >> 16)
val tmp4 = (((tmp2 << 4) + tmp3) >> 4)
val tmp5 = ((-20552L * st4) >> 15)
val tmp6 = (((tmp4 << 4) + tmp5) >> 4)
val tmp7 = ((-20133L * y1) >> 22)
val tmp8 = (((tmp6 << 4) + tmp7) >> 4)
val tmp9 = ((16777L * y2) >> 19)
val tmp10 = (((tmp8 << 4) + tmp9) >> 4)
return tmp10
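The integer/fractional split used above can be computed directly from the ranges. The following is a minimal sketch of our own (not Xfp's actual code) of that allocation rule for a signed format; it ignores corner cases such as values that are exact powers of two.

import scala.math.{ceil, log}

// Number of fractional bits for a value whose magnitude is at most maxAbs,
// in a signed fixed-point format with totalBits bits (1 bit reserved for the sign).
def fractionalBits(maxAbs: Double, totalBits: Int = 16): Int = {
  val intBits = ceil(log(maxAbs) / log(2)).toInt   // bits needed left of the binary point
  totalBits - 1 - intBits
}

// fractionalBits(10.0)   == 11  (format for inputs in [-10, 10])
// fractionalBits(0.0078) == 22  (format for the constant 0.0078, since 0.0078 < 2^-7)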

A fixed-point arithmetic implementation of the controller can have a large round-off error. For instance, with a bit length of 16, the input values can already have a round-off error as large as 0.00049. We can apply our affine arithmetic-based analysis presented in chapter 2 to compute an upper bound on the overall error of 3.9e-3. Since this technique computes worst-case errors, it is not immediately clear whether such large errors can indeed occur. Running a simulation with a double-precision floating-point and a fixed-point implementation side by side on a large number of random inputs, we obtain a lower bound on the error of 3.06e-3. So indeed, the error can be quite large.

One way to reduce the error is to increase the bit length. If we add one bit to each variable, we get a simulated maximum error of 1.51e-3, which is an improvement of about 50%. However, increasing bit widths may require implementing circuits with larger areas or using larger data types, and may not be feasible.


expression           simulated error

original             3.06e-3
worst rewrite        3.11e-3
additional bit       1.52e-3
best rewrite         1.39e-3
best found by GP     1.39e-3

Table 4.1 – Summary of absolute errors for different implementations

A different possibility is to use a different order of evaluation for the expression. As fixed-point arithmetic operations are not associative, two different evaluation orders of the same expression can have significantly different round-off errors. If we reorder Equation (4.1) as

((0.9052 * st2) + (((st3 * (-0.0181)) + (-0.0078 * st1)) + (((-0.0392 * st4) + (-0.0003 * y1)) + (0.002 * y2))))     (4.2)

and again implement it using 16-bit fixed-point arithmetic, we find by simulation an (under-approximated) error bound of 1.39e-3. This is an even larger improvement of 55% than the one obtained by increasing the bit length, without requiring any extra resources. Table 4.1 summarizes the worst-case error bounds for the different formulations of the expression. By exhaustively enumerating all possible rewrites, we see that the maximum error bounds vary between approximately 1.39e-3 and 3.11e-3. That is, even for a relatively short example, the worst error bound can be over a factor of 2 larger than the best possible one. The main reason for these differences comes from the fact that changing the order of execution may change the intermediate ranges of variables, which in turn influence the fixed-point formats and round-off errors, as well as the propagation in the case of nonlinear operations.

((0.9052 st2) (((st3 0.0181) ( 0.0078 st1)) ∗ + ∗ − + − ∗ + (4.2) ((( 0.0392 st4) ( 0.0003 y1)) (0.002 y2)))) − ∗ + − ∗ + ∗ and again implement it using 16-bit fixed-point arithmetic, we find by simulation an (under- approximated) error bound of 1.39e-03. This is an even larger improvement of 55% than by increasing the bit length, without requiring any extra resources. Figure 4.1 summarizes the worst-case error bounds for the different formulations of the expression. By exhaustively enumerating all possible rewrites, we see that the maximum error bounds can vary between approximately 1.39e-3 and 3.11e-3. That is, even for a relatively short example, the worst error bound can be over a factor of 2 larger than the best possible one. The main reason for these differences comes the fact that changing the order of execution may change the intermediate ranges of variables, which in turn influence the fixed-point formats and round-off errors, as well as the propagation in the case of nonlinear operations.

Since exhaustive enumeration becomes infeasible very quickly, we want to search the space of possible implementations of an arithmetic expression in a more effective fashion, with the goal of finding one that has the minimum fixed-point implementation error bound. We have chosen genetic programming (GP) as our search procedure. On our example, GP can find the optimal expression without an exhaustive enumeration and can do this at analysis time. This does not cost any additional runtime resources, so we get the additional accuracy “for free”.

4.2 Search with Genetic Programming

We now describe our algorithm to find a fixed-point implementation of a mathematical expression which reduces the overall round-off error compared to a straightforward implementation.

That is, given a real-valued expression t we aim to find an expression t′ that is mathematically equivalent to t and whose implementation in fixed-point arithmetic t̃′ minimizes, among all

equivalent expressions, the worst-case absolute error over all inputs in the given ranges I:

min_{equivalent t′}  max_{x ∈ I}  | t(x) − t̃′(x) |

Our tool accepts as input an arithmetic expression generated by the following grammar:

t ::= c | x | t1 + t2 | t1 − t2 | t1 ∗ t2 | t1 / t2

where c and x are rational constants and variables, respectively. The output is an alternative expression with a different order of computation whose error, according to our static analysis, is smaller than that of the original one. If our tool cannot find a better rewrite, it returns the original expression. We also generate the corresponding integer-valued fixed-point implementation.
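A hedged sketch of an abstract syntax tree matching this grammar follows; Xfp's actual representation may differ. The Neg and Inv nodes are included because subtraction and division are rewritten into them before the search, as described in section 4.2.2.

sealed trait Expr
case class Const(v: BigDecimal)    extends Expr
case class Var(name: String)       extends Expr
case class Plus(l: Expr, r: Expr)  extends Expr
case class Times(l: Expr, r: Expr) extends Expr
case class Neg(e: Expr)            extends Expr   // x - y becomes Plus(x, Neg(y))
case class Inv(e: Expr)            extends Expr   // x / y becomes Times(x, Inv(y))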

4.2.1 Genetic Programming

Genetic programming is part of the broad class of genetic algorithms. These heuristic search algorithms, inspired by natural evolution, are parameterized by a fitness function to evaluate a candidate solution, as well as operators called mutation and crossover to generate new candidate solutions from previous ones. The general algorithm maintains a population of candidate solutions, and evolves the current population in phases called generations. An evolution step picks two candidates from the current generation, and computes new candidate solutions by applying the mutation and crossover operations. The quality of the new solution is evaluated with the fitness function, and the new candidate is added to the next generation if the fitness function assigns a sufficiently high score to it. In genetic algorithms, candidate solutions are usually represented by strings, for which mutation and crossover can be defined easily. Selection of candidates from the population for mutation and crossover can be performed, for example, by tournament selection, where a fixed number of candidates is chosen at random and the one with the highest fitness is selected as the final candidate. This allows, with some probability, “less fit” candidates to participate in the evolution process. Together with mutation and crossover, this allows for a search which can explore different parts of the search space quickly and is less likely to get caught in a local optimum. Genetic programming [129] is a variant of a genetic algorithm that performs the search over computer programs, i.e. their abstract syntax trees (ASTs), instead of strings, with mutation and crossover adapted accordingly.
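The tournament selection just described can be sketched in a few lines. This is our own illustration, not Xfp's implementation; we use the analyzed round-off error as the fitness, so smaller values are better.

import scala.util.Random

// Draw tournamentSize candidates at random (with replacement) and keep the
// fittest one, i.e. the one with the smallest analyzed error bound.
def tournamentSelect[E](population: IndexedSeq[E], tournamentSize: Int)(fitness: E => Double): E = {
  val contestants = Seq.fill(tournamentSize)(population(Random.nextInt(population.length)))
  contestants.minBy(fitness)
}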

4.2.2 Instantiating Genetic Programming

Figure 4.1 gives an overview of our search procedure, whose steps we explain in the following paragraphs. The input is a real-valued expression and ranges for its input variables. Our tool initializes the initial population with 30 copies of this expression. Our mutation and crossover operators need to generate expressions that are mathematically equivalent to the initial expression. Since standard genetic programming poses no constraints on the generated expressions and can also generate invalid ones, we define special-purpose mutation and crossover operators ourselves. Our fitness function quantifies the worst-case numerical error between the fixed-point implementation and the mathematical expression.


 1  Input: expression, input ranges
 2  initialize population of 30 expressions
 3
 4  repeat for 30 generations
 5    generate 30 new expressions:
 6      select 2 expressions with tournament selection
 7      do equivalence-preserving crossover
 8      do equivalence-preserving mutation
 9      evaluate fitness (roundoff error)
10  Output: best expression found during entire run

Figure 4.1 – Best expression search procedure

Mutation Standard mutation in genetic programming first selects a random node in the expression AST and mutates it, either by replacing it entirely with a different AST node, or by modifying its information. In our instantiation, the mutation operator applies one of the applicable rewrite rules from Table 4.2 to the randomly selected node. The rules capture the usual commutativity, distributivity and associativity of real arithmetic and preserve mathematical equivalence. Some of these rules do not have an effect on the numerical accuracy by themselves, but are necessary to generate other rewrites of an expression. To keep the operations simple, we rewrite subtractions (x − y → x + (−y)) and divisions (x/y → x ∗ (1/y)) before the GP run.
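As an illustration, one equivalence-preserving mutation can be written as a pattern match on the Expr AST sketched in section 4.2; this is our own sketch of rule 1, not Xfp's code.

// Rule 1: (x + y) + z  ->  x + (y + z); nodes where the rule does not apply
// are returned unchanged, so the mutation is a no-op there.
def reassociate(e: Expr): Expr = e match {
  case Plus(Plus(x, y), z) => Plus(x, Plus(y, z))
  case other               => other
}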

Crossover Usual genetic programming crossover picks a random subtree each from two expression candidates and exchanges them, creating two new expression ASTs. As our constraint is to produce mathematically equivalent expressions, we need to find two subtrees that are mathematically equivalent. The following algorithm is incomplete in the sense that it may fail to find such a pair. It is, however, efficient as it does not perform a general equivalence check, and we have observed its effectiveness in practice.

(x + y) + z = x + (y + z)     (-x) * y = -(x * y)      (x * y) + (x * z) = x * (y + z)
x + y = y + x                 x * (-y) = -(x * y)      (x * z) + (y * z) = (x + y) * z
(-x) + (-y) = -(x + y)        1/x * 1/y = 1/(x*y)      (x * y) + (z * x) = x * (y + z)
(x * y) * z = x * (y * z)     -(1/x) = 1/(-x)          (y * x) + (x * z) = (y + z) * x
x * y = y * x

Table 4.2 – Rewrite rules


Given two trees t1 and t2 as candidates, the crossover algorithm picks a random node in t1, which is the root of the subtree we call s1. To ensure that crossover produces a mathematically equivalent expression, we have to find, in an efficient way, a subtree s2 in t2 that is mathematically equivalent to s1. Instead of implementing a general decision procedure, each subtree is annotated at initialization time (line 2 in Figure 4.1) with a label that is the string representation of the expression at that subtree. During mutation, labels are preserved in the new generation as much as possible. For example, suppose we have the node (x + y) + z, with label (x + y) + z. We can apply mutation rule 1 to obtain x + (y + z), but the label will remain (x + y) + z. Note that some of the mutation rules break equivalences (e.g. the rules in the third column), hence not all labels can be preserved; in that case we add a new label. During crossover, we then only need to check for identical labels. If labels match, it means that the subtrees come from the same initial subtree and hence are mathematically equivalent, and we can exchange them in a crossover operation.
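The label check itself is then trivial; the following is our own sketch (using the Expr AST from section 4.2), not the actual implementation.

// Two subtrees may be exchanged during crossover only if both carry a label
// and the labels are identical, i.e. they descend from the same initial subtree.
def canSwap(s1: Expr, s2: Expr, label: Expr => Option[String]): Boolean =
  (label(s1), label(s2)) match {
    case (Some(l1), Some(l2)) => l1 == l2
    case _                    => false
  }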

Parameters Our genetic programming pipeline has several parameters that can influence the results: the population size (we choose 30 as we observed no benefit beyond this value), the number of best individuals passed on to the next generation unchanged (elitism) (0, 2 or 6), the number of individuals considered during tournament selection (2, 4 or 6), and the probability of crossover (0.0, 0.5, 0.75 or 1.0). The most successful setting we found is with a tournament selection among 4 and an elitism of 2 while performing crossover every time, i.e. with probability of 1.0. Note, however, that even in the case of other settings, the improvements are still significant (on the order of 50%).

4.2.3 Fitness Evaluation

We use the round-off error analysis from chapter 2 as our fitness function, now applied to fixed-point arithmetic. This analysis is similar to [60], but we treat constants like normal variables and we do not discard higher-order terms, to ensure that the error bounds we compute are sound with respect to real arithmetic.

If we are interested in proving that the round-off errors stay within certain bounds, the computed absolute bounds on these errors need to be as tight as possible. The main requirement on the analysis in our current problem is slightly different, however. While tight bounds on errors are an advantage, what we really need to know is the relative precision of our analysis tool, that is, whether the analysis tool is able to distinguish a better implementation from a less accurate one. This is not necessarily given, because the analysis assumes worst-case errors at each computation step and these will, in general, not be attained every time. Hence, our tool may report a difference between two expressions that is due to the over-approximations of the algorithm and not a true difference in accuracy.

Thus, before using our analysis tool in a GP framework, we evaluated this property experimentally. We generated a number of random rewrites of an expression, for which we then obtained the (approximate) actual errors by simulation.


[Plots: analyzed upper bounds and simulated lower bounds on the absolute errors of random rewrites, for Batch processor – state 2 (top, errors on the order of 1e-3) and Rigid body – out 1 (bottom, errors up to about 0.15).]

Figure 4.2 – Comparison of analyzed upper bound and simulated lower bound on maximum errors for a linear and a nonlinear benchmark

We present here the results for one linear and one nonlinear benchmark (batch controller, state 2 and rigid body, out 1, respectively). For 100 different random expression formulations, the ratio between the analyzed upper bound on the error and the simulated lower bound on the error has a mean of 1.29387 and a variance of 0.00082 for the batch controller, state 2 benchmark, and a mean of 1.66697 and a variance of 0.08315 for the rigid body, out 1 benchmark. Figure 4.2 shows a direct comparison between the analyzed and simulated errors. In the linear case, the computed bounds on the errors are proportional to the actual errors, thus indicating a good relative precision. In the nonlinear case the correspondence is not as accurate, but we expect it to still be sufficient for our purpose. The “more nonlinear” a computation becomes, the less accurate we expect affine arithmetic to be in discriminating expressions by accuracy.

Effectively, our search algorithm with this fitness function returns the expression for which it is able to prove the smallest error bound. Even if this does not happen to be the truly optimal one, for many applications or certification purposes it is already useful to obtain an (otherwise equivalent) expression with a smaller proven error bound.


4.2.4 Why Genetic Programming?

It is in general not evident from an expression whether its order of computation is good with respect to accuracy, and exhaustively enumerating all possible formulations of expressions becomes impossible very quickly. Already for linear expressions, the number of possible orders of adding n terms, modulo commutativity (which does not affect accuracy), is (2n − 3)!!.¹ For our example from Section 4.1 with 6 terms there are already 945 expressions. For our largest benchmark with 15 terms there are too many possibilities to enumerate.

We thus need a suitable search strategy to find a good formulation of an expression among all the possibilities. We want to minimize intermediate ranges of variables, but because the inputs for the expressions can, in general, be positive and negative, optimizing one subcomputation may lead to a very large intermediate sum in a different part of the expression. An algorithm that tries to find the optimal solution in a systematic way (e.g. dynamic programming) is thus unlikely to succeed. Our problem also does not have a notion of a gradient, so gradient descent approaches are not readily applicable. Furthermore, the problem cannot be easily formulated in terms of inputs and outputs or constraints, so constraint solving approaches are not applicable either. Genetic programming does not rely on any of these features, and its formulation as a search over program ASTs fits our problem well.

4.3 Experimental Results

We have implemented our genetic programming algorithm inside the Java-based Evolutionary Computation Research System (ECJ) [105], using the round-off error analysis presented in chapter 2. For this application, we used the rational implementation, as the expressions to be evaluated are relatively short.

Benchmarks come from the controller domain: a bicycle model [12], a DC motor position control [2], a pitch angle control [2], an inverted pendulum [2], and a batch reactor process [75]. The controllers for these systems are taken from [108], which attempted to minimize the size of the region of practical stability by choosing a controller whose fixed-point implementation has the best possible bound on the error among all controllers that stabilize the plant. To show the scalability of our tool we choose the example of a locomotive pulling a train car, where the connection between the locomotive and the car is modeled by a spring in parallel with a damper [114]. By increasing the number of cars, we can increase the dimension of the system. We also consider a nonlinear controller for a rigid body [10] and the nonlinear B-spline functions [101]. Though most of our benchmarks are from the controller domain, nothing in our approach is actually domain specific.

¹ The number of full binary trees with n leaves is C_{n−1}, where C_n are the Catalan numbers. We can label each of the trees in n! ways. Taking commutativity into account gives C_{n−1} · n! / 2^{n−1}.


Each benchmark consists of one expression and ranges for its input parameters. We wish to minimize the error on the one output value it computes over all possible inputs. Some of the benchmarks compute internal states of a controller (denoted with e.g. “state 1”). Since each state is computed with a different expression, we treat them here as separate benchmarks. For all benchmarks we consider a fixed bit length, signed fixed-point format, and truncation as the rounding mode.

4.3.1 End-to-End Results

Tables 4.3 and 4.4 list the maximum absolute errors (as computed by our analysis tool) for the best expressions found by GP for all our benchmarks. The results were obtained with the best parameter settings we found: elitism: 2, tournament selection: 4, with and without crossover (seed used in ECJ: 4357). The best found expression is the same for different bit lengths, so the computed results are applicable in several different hardware settings. The benchmarks are ordered approximately by complexity with the smaller linear benchmarks in Table 4.3 and the nonlinear benchmarks in Table 4.4. From the third column we can see that we get substantial improvements in accuracy of up to 70%.

Benchmark               err orig.   (orig. - no-cross)/orig.   (orig. - best)/orig.   g

bicycle (out1)          2.66e-3     0.00                       0.00                   -
bicycle (state1)        2.53e-4     0.19                       0.19                   1
bicycle (state2)        1.82e-4     0.00                       0.00                   -
dc motor (out1)         1.06e-4     0.00                       0.00                   -
dc motor (state1)       2.77e-4     0.00                       0.00                   -
dc motor (state2)       3.75e-4     0.25                       0.25                   4
dc motor (state3)       1.27e-4     0.00                       0.00                   -
pendulum (out1)         8.09e-8     0.03                       0.03                   5
pendulum (state1)       5.13e-9     0.17                       0.17                   1
pendulum (state2)       6.11e-9     0.38                       0.38                   16
pendulum (state3)       5.14e-9     0.00                       0.00                   -
pendulum (state4)       4.97e-9     0.27                       0.27                   7
pitch angle (out1)      1.33e-7     0.18                       0.18                   4
pitch angle (state1)    4.26e-9     0.30                       0.30                   2
pitch angle (state2)    2.79e-9     0.00                       0.00                   -
pitch angle (state3)    3.81e-9     0.20                       0.20                   2

Table 4.3 – Maximum absolute errors for the best expression found by GP (first part). g denotes the generation in which the solution is found.


Benchmark                 err orig.   (orig. - no-cross)/orig.   (orig. - best)/orig.   g

batch reactor (out1)      5.15e-4     0.00                       0.00                   -
batch reactor (out2)      1.28e-3     0.12                       0.12                   2
batch reactor (state1)    3.46e-4     0.15                       0.15                   1
batch reactor (state2)    2.77e-4     0.00                       0.00                   -
batch reactor (state3)    3.55e-4     0.26                       0.26                   2
batch reactor (state4)    4.11e-4     0.23                       0.23                   7
traincar 1 (out)          1.11e-4     0.09                       0.09                   2
traincar 1 (state 1)      1.98e-6     0.03                       0.03                   6
traincar 1 (state 2)      3.57e-7     0.25                       0.25                   16
traincar 1 (state 3)      2.79e-7     0.24                       0.24                   7
traincar 2 (out)          7.40e-5     0.09                       0.09                   19
traincar 2 (state 3)      1.23e-7     0.49                       0.59                   21
traincar 3 (out)          1.26e-3     0.13                       0.13                   7
traincar 3 (state 6)      1.32e-7     0.48                       0.58                   21
traincar 3 (state 7)      1.31e-7     0.43                       0.53                   17
traincar 4 (out)          9.34e-3     0.26                       0.29                   27
traincar 4 (state 1)      7.29e-8     0.73                       0.73                   19
traincar 4 (state 2)      7.34e-8     0.67                       0.73                   25
traincar 4 (state 3)      1.01e-7     0.66                       0.60                   14
traincar 4 (state 4)      6.96e-8     0.64                       0.70                   26
traincar 4 (state 5)      1.42e-7     0.61                       0.68                   26
traincar 4 (state 6)      1.67e-7     0.59                       0.59                   16
traincar 4 (state 7)      1.67e-7     0.56                       0.56                   13
traincar 4 (state 8)      1.38e-7     0.60                       0.60                   19
traincar 4 (state 9)      1.67e-7     0.47                       0.47                   7
bspline 1                 2.29e-4     0.36                       0.36                   6
bspline 2                 1.66e-4     0.52                       0.52                   4
rigid-body (out1)         1.08e-1     0.33                       0.33                   5
rigid-body (out2)         9.92e-1     0.20                       0.20                   15

Table 4.4 – Maximum absolute errors for the best expression found by GP (second part). g denotes the generation in which the solution is found.


Benchmark                    Original   Best                Worst               Added bit
                                        (fraction of orig.) (fraction of orig.) (fraction of orig.)

batch processor (out 1)      2.89e-3    0.91                1.37                0.51
batch processor (out 2)      7.20e-3    0.81                1.13                0.49
batch processor (state 1)    2.66e-3    0.52                1.02                0.50
batch processor (state 2)    3.06e-3    0.45                1.03                0.50
batch processor (state 3)    2.66e-3    0.50                1.16                0.49
batch processor (state 4)    2.24e-3    0.61                1.40                0.51
traincar 1 (out)             5.29e-5    0.78                1.02                0.60
traincar 1 (state 1)         1.02e-6    0.97                1.01                0.50
traincar 1 (state 2)         2.66e-7    0.71                1.02                0.52
traincar 1 (state 3)         2.04e-7    0.74                1.14                0.48

Table 4.5 – Best and worst absolute errors among exhaustively enumerated rewritings, determined by simulation

It can be shown [48] that our rewriting procedure, when implemented inside a tool that performs a search for a best controller [108], can provide significant improvements in the size of the region of practical stability. This result also shows that our technique is orthogonal to other optimization strategies and can thus be used in addition to them to provide further improvements.

4.3.2 Exhaustive Rewriting

For our smaller benchmarks (linear with up to 6 terms) we also generate all possible rewrites up to commutativity and determine tight bounds on the maximum errors by simulation. For this, we first automatically generate a fixed-point implementation, which we then evaluate on a number of random inputs. We use double-precision floating-point code as the reference, thus obtaining a lower bound on the error. Using a large enough number of random inputs (10^7), we obtain reasonably tight error bounds. Table 4.5 shows the simulation results for the original formulation of the expression and for the best and worst among all possible rewrites. Of the 10 benchmarks, 6 have an error in the original formulation that is about as bad as it gets. But we can also see that the best possible rewrite can improve the accuracy substantially. On the other hand, the expression may also be such that no matter how we rewrite it, the accuracy does not vary much, as is the case with the traincar 1, state 1 benchmark. This case, however, seems to be rather rare.
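The simulation itself is straightforward; the following is a minimal sketch of our own (not the actual experiment code), where the fixed-point implementation is assumed to be wrapped so that it takes and returns Double values.

import scala.util.Random

// Runs the fixed-point implementation and the double-precision reference on
// random inputs drawn from [lo, hi] and returns the largest observed
// difference, i.e. a lower bound on the true maximum error.
def simulatedError(fixedPoint: Array[Double] => Double,
                   reference: Array[Double] => Double,
                   lo: Double, hi: Double, arity: Int,
                   samples: Int = 10000000): Double = {
  var maxErr = 0.0
  for (_ <- 0 until samples) {
    val xs = Array.fill(arity)(lo + Random.nextDouble() * (hi - lo))
    maxErr = math.max(maxErr, math.abs(fixedPoint(xs) - reference(xs)))
  }
  maxErr
}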

Another possibility for improving the accuracy is to increase the bit lengths (possibly selectively). We do so in a minimal way by allowing one more bit for each intermediate variable, i.e., we increase the bit length by one and evaluate the accuracy with simulation. Note that in many cases, such a gradual increase may not be possible and one would have to add a whole word, e.g. go from 16 to 32 bits, with the associated increase in hardware cost.

Compile-time transformations, on the other hand, come “for free” and, as Table 4.5 shows, can have an effect of the same order of magnitude as the addition of a bit. In the case of longer benchmarks, the effect of rewriting can be even larger (see Table 4.4), while the added bit always gains only about 50% of accuracy compared to the original error.

4.3.3 Genetic Programming

Tables 4.3 and 4.4 show the results obtained for the most successful setting we have found. They also show the results with crossover turned off. The comparison of these two columns suggests that crossover is helpful. We therefore expect that randomized local search techniques are not as effective as genetic programming, although they still produce useful reductions in the errors.

Optimality For the smaller benchmarks, GP always finds the same expressions with respect to the error bounds. Exhaustive enumeration has confirmed that the found expressions are indeed the optimal ones. For larger benchmarks, we do get improvements and we know of no technique to obtain better results.

Performance The runtimes of our GP algorithm depend in general mostly on the number of generations considered and the population size. Crossover only has a small effect on the overall runtime, but we have found that it provides the best results in the setting given above. In Table 4.6 we report the running times of our benchmarks with the default setting of 30 generations with a population size of 30 and the best GP settings we found.

Examples                Runtime (s)

batch (out1)            1.964
batch (state2)          2.735
traincar 1 (state 3)    6.358
traincar 2 (state 5)    9.794
traincar 3 (state 7)    15.388
traincar 4 (state 9)    17.228
rigid-body (out1)       1.394
rigid-body (out2)       2.698
bspline 1               2.234

Table 4.6 – Average runtimes of GP on selected benchmarks in seconds. Experiments were performed on a 3.5GHz Linux desktop with 16GB RAM.


Example             GP         Random     (Random - GP)/Random

traincar 4 (out)    0.00934    0.0103     0.09
traincar 4 (1)      7.29e-8    2.07e-7    0.65
traincar 4 (2)      7.34e-8    2.08e-7    0.65
traincar 4 (3)      1.01e-7    1.90e-7    0.47
traincar 4 (4)      6.96e-8    1.74e-7    0.60
traincar 4 (5)      1.42e-7    3.21e-7    0.56
traincar 4 (6)      1.67e-7    2.86e-7    0.42
traincar 4 (7)      1.67e-7    2.57e-7    0.35
traincar 4 (8)      1.38e-7    2.28e-7    0.39
traincar 4 (9)      1.67e-7    1.97e-7    0.15

Table 4.7 – Comparison of maximal errors between best expressions from GP and random search

Efficiency Improvement over Random Search For the expressions of the traincar 4 controller (15 terms) we also compare the results of the GP algorithm against a random search. This experiment is performed by generating 900 random and unique rewrites of the original expression, and comparing the best expressions found by each approach against each other. Since we run the GP algorithm for 30 generations with a population of 30, it can see at most 900 unique expressions. As the results in Table 4.7 confirm, the GP-based search is more effective than a random one. The third column shows the relative difference between the errors; the expression found by GP is often more than 50% more accurate than a randomly found one. That GP does not perform a random search can also be seen in the evolution of the population in Figure 4.3 for one benchmark (traincar 4, state 1). The plot shows the best, worst and average errors of the expressions in each generation. The convergence to a low-error expression is clear.

[Graph: minimum, maximum and average analyzed absolute error (on the order of 1e-7) of the population over 30 generations for the traincar 4 – state 1 benchmark.]

Figure 4.3 – Evolution of errors across generations for the traincar 4 - state 1 example


4.4 Conclusion

We have presented a genetic programming based search for synthesizing fixed-point arithmetic expressions with improved error bounds. We have systematically investigated the effect of rewriting on the round-off errors, the suitability of an affine arithmetic based static analysis as the fitness function, as well as the effects of mutation and crossover. A lightweight static analysis such as ours is particularly practical for a search like genetic programming, because it is fast and can thus be applied repeatedly. In addition, it has the advantage that the returned results come with sound error bounds.

5 Writing and Compiling Real Programs

So far, we have presented an analysis technique for estimating round-off errors in finite-precision code, as well as its application in a runtime library for floating-point arithmetic and inside a genetic programming algorithm that finds more accurate fixed-point implementations. Both of these applications have in common that they require the programmer to choose the data type up front and then to remember to verify the accuracy of the results, essentially leaving accuracy considerations an afterthought.

In this chapter, we propose an alternative way of programming numerical code by introducing a specification language with a real semantics together with a high-level compilation and verification algorithm. Our language includes numerical errors explicitly and uses these specifications to determine a suitable finite-precision data type automatically.

This chapter is based on the paper [47]. The presented framework has been implemented in the tool called Rosa, which is part of the Leon verification framework [21] and relies on the Z3 solver [53] as a back-end. Its source code is available at github.com/malyzajko/rosa.

5.1 Programming in Reals

Many current approaches for verifying numerical programs start with the finite-precision implementation and then try to verify the absence of (runtime) errors. Not only are such verification results specific to a given representation of numbers, but the absence of run-time errors does not guarantee that program behavior matches the desired specification expressed using real numbers. Fundamentally, the source code semantics is mostly expressed in terms of low-level data types such as floating-points. This is problematic not only for developers but also for compiler optimizations, because, e.g. code transformations based on associativity such as those presented in chapter 4 are unsound with respect to such source code semantics.

We propose the following alternative: source code programs should be expressed in terms of mathematical real numbers and determining which data type is suitable for the low-level implementation should be done, or at least supported by automated tools. In our system,

the programmer writes a program using a Real data type, including pre- and postconditions which specify explicitly the uncertainties on inputs as well as the desired accuracy of the result.

It is then up to our trustworthy compiler to check, taking into account all uncertainties and their propagation, that the desired accuracy can be soundly realized in a finite-precision implementation. If so, the compiler chooses and emits one such implementation, selecting from a range of (software or hardware) floating-point or fixed-point arithmetic representations. Viewing source code as operating on real numbers has many advantages:

Separation of concerns A program written with reals separates the mathematical algorithm from its low-level implementation details. Programmers can reason about correctness using real arithmetic instead of finite-precision arithmetic. We achieve a separation of the design of algorithms (which may still be approximate for other reasons) from their realization using finite-precision computations.

Verification over reals We can verify the ideal meaning of programs using techniques devel- oped to reason over real numbers, which are more scalable and better understood than techniques that directly deal with finite-precision arithmetic. If we then also establish bounds on how far the results computed with finite precision are from the real-valued ones, we can obtain overall correctness guarantees.

Compiler optimizations The compiler for reals is free to do optimizations as long as they preserve the accuracy requirements. This allows the compiler to apply, for example, associativity of arithmetic (for example as described in chapter 4), or even select different approximation schemes for transcendental functions.

Specification of ideal behavior A real specification language provides us with the ideal behavior of the program. We can use this as the baseline against which to compare approximate implementations. This enables us to quantify the deviation of implementation outputs from ideal ones, instead of merely proving, e.g., range bounds of floating-point variables, as is done in simpler analyses that check for freedom from runtime errors.

5.1.1 A Real Specification Language

In our framework each program to be compiled consists of one top-level object with methods written in a functional subset of the Scala programming language [125]. On the one hand, choosing only the functional subset makes the verification task easier; on the other hand, we believe that this subset is also closer to the mathematical semantics.

All methods are functions over the Real data type and the user annotates them with pre- and postconditions that explicitly talk about uncertainties. Real represents ideal real numbers without any uncertainty. We allow arithmetic expressions over Reals with the standard arithmetic operators {+, −, ∗, /, √}, and together with conditionals and function calls they form the body of methods. Our tool also supports immutable variable declarations such as val x = ....


def triangle(a: Real, b: Real, c: Real): Real = {
  require(1.0 < a && a < 9.0 && 1.0 < b && b < 9.0 && 1.0 < c && c < 9.0 &&
    a + b > c + 0.1 && a + c > b + 0.1 && b + c > a + 0.1)

  val s = (a + b + c)/2.0
  sqrt(s * (s - a) * (s - b) * (s - c))

} ensuring (res => 0.29 <= res && res <= 35.1 && res +/- 2.7e-11 &&
  0.29 <= ~res && ~res <= 35.1)

Figure 5.1 – Example function written in our specification language

Loops can be expressed with recursive functions. This language allows the user to define the ideal (or baseline) computation over real numbers. Note that this specification language is not executable.

For example, recall the function computing the area of a triangle from section 3.1. The code in Figure 5.1 shows a possible specification of this function using our proposed real-valued semantics. The precondition allows the user to provide a specification of the environment, consisting of lower and upper bounds for all method parameters and an upper bound on the uncertainty or noise. Range bounds are expressed with regular comparison operators, e.g. x < 9.0. Uncertainty is expressed with a predicate such as x +/- 2.7e-11, which denotes that the variable x is only known up to an uncertainty of 2.7e-11. If the programmer does not specify the noise explicitly, round-off errors are assumed by default. The postcondition can specify constraints on the output, in particular the range and the maximum accepted uncertainty.

In addition to this specification, the user may also provide additional constraints on the inputs, such as a + b > c + 0.1 in the example. These constraints go beyond simple interval constraints and are important since they allow us to analyze more general functions. Indeed, in the triangle area example we do not have to restrict the input ranges such that a negative radicand does not occur.

Writing x in our language references the ideal real-valued variable. The user may also want to express that the finite-precision computation does not violate certain bounds. For this case, our language provides the notation ~x, which references the actual value of x when executed in finite precision, which we denote by x̃.

5.2 Compiling Reals to Finite Precision

Now that we have a real-valued specification, a “verifying” compiler needs to instantiate the program with a concrete finite-precision data type that satisfies the specification. In the case where the choice falls on a fixed-point data type, the tool also needs to generate the fixed-point

code (over integers); in the case of floating-point arithmetic, code generation essentially reduces to replacing the Real type with the chosen floating-point type. The main task thus consists in verifying that a given data type satisfies the specification.

To this end, we generate verification conditions based on the high-level idea of explicitly modeling the ideal program without external uncertainties and round-offs, the actual program, which is executed in finite precision with possibly noisy inputs, and the relationship between the two. This is possible exactly because we have the real-valued semantics available through our real-valued specification. Furthermore, we can encode reasoning about finite-precision round-off errors as reasoning about real numbers. On the one hand, this allows us to leverage the superior scalability of real-valued solvers over floating-point or bit-vector ones. On the other hand, a combination of theories is problematic, and encoding everything into reals lets us consider the relationship between the real-valued and the finite-precision computation.

5.2.1 Verification Conditions for Loop-Free Programs

In this and the following chapter we consider loop-free programs. While the presented techniques also work, in principle, for recursive functions, in practice a successful verification requires the specification to be inductive, including its error part. However, except in very special cases, round-off errors grow with every iteration, hence inductive specifications written in the language we present here (with constant error bounds) are unlikely to exist. We present an extension that can handle programs with loops in chapter 7, and the remaining concepts presented in this chapter carry over to those programs as well.

Our approach constructs the following verification condition for each method with a precondition P and a postcondition Q:

∀ x, res, y.  P(x) ∧ body(x, y, res) → Q(x, res)     (5.1)

where x, res and y, possibly vector-valued, denote the input, output and local variables, respectively.

Table 5.1 summarizes how verification constraints are generated from our specification language for floating-point arithmetic. Each variable x in the specification corresponds to two real-valued variables x and x̃: the ideal one in the absence of uncertainties and round-off errors, and the actual one, computed by the compiled program. The ideal and actual variables are related only through the error bounds in the pre- and postconditions, which allows the ideal and actual executions to take different paths through the program. In the method body we have to take into account round-off errors from arithmetic operations and the propagation of existing errors. Note that the resulting verification conditions are parametric in the machine epsilon.
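As a small worked illustration (our own example, not one of the thesis benchmarks): for a method body consisting of the single expression x * y with result res, Table 5.1 yields the ideal constraint res = x ∗ y and the actual constraint res̃ = (x̃ ∗ ỹ)(1 + δ1) with −ε_m ≤ δ1 ≤ ε_m, where the precondition relates x̃ to x (and ỹ to y) through the given error bounds; the postcondition is then checked against both res and res̃.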

Fixed-point arithmetic can, in principle, also be encoded in a similar fashion, albeit with more complex constraints, as it does not admit a dynamic format allocation the way floating-point arithmetic does.


a <= x && x <= b             x ∈ [a, b]
x +/- k                      x̃ = x + err_x  ∧  err_x ∈ [−k, k]
~x                           x̃
x ◦ y   (ideal part)         (x ◦ y)
x ◦ y   (actual part)        (x̃ ◦ ỹ)(1 + δ1)
sqrt(x) (ideal part)         sqrt(x)
sqrt(x) (actual part)        sqrt(x̃)(1 + δ2)
val z = x                    z = x  ∧  z̃ = x̃
if (c(x)) e1(x) else e2(x)   ((c(x) ∧ e1(x)) ∨ (¬c(x) ∧ e2(x))) ∧
                             ((c̃(x̃) ∧ ẽ1(x̃)) ∨ (¬c̃(x̃) ∧ ẽ2(x̃)))
g(x)                         g(x) ∧ g̃(x̃)

where ◦ ∈ {+, −, ∗, /}, −ε_m ≤ δ_i ≤ ε_m, all δ_i are fresh,
and c̃, ẽ denote the functions with round-off errors at each step

Table 5.1 – Specification language semantics

arithmetic does. Current approaches for encoding fixed-point arithmetic rely on bit vectors, whose use in our case is problematic, as we would need to mix bit-vector and nonlinear real theories in the SMT solver. As we will explain shortly (subsection 5.2.3), such a direct encoding does not scale well even for floating-point arithmetic, hence we switch to using approximations immediately.
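As a small, hypothetical illustration of Table 5.1 (the function and all constants here are invented for this purpose and are not taken from our benchmarks), consider a specification with a single multiplication; the comments sketch the constraints that the rules of the table would generate, writing x̃ for the actual variable and δ_1 for the fresh roundoff variable bounded by the machine epsilon ε_m.

def scale(x: Real): Real = {
  require(0 <= x && x <= 1 && x +/- 1e-10)
  // precondition:  x ∈ [0, 1]  ∧  x̃ = x + err_x  ∧  err_x ∈ [-1e-10, 1e-10]
  x * 0.1
  // ideal part:    res = x * 0.1
  // actual part:   res̃ = (x̃ * 0.1)(1 + δ_1),   with  -ε_m ≤ δ_1 ≤ ε_m,  δ_1 fresh
} ensuring (res => res +/- 1e-9)
// postcondition:  the error part requires |res − res̃| ≤ 1e-9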

Our system currently supports the operations {+, −, ∗, /, √}, which are the operations supported by the nonlinear back-end solver Z3. The techniques in the following chapters can be extended to elementary functions, provided the solver can handle them. Another possible avenue would be to encode them via Taylor expansions [109].

5.2.2 Specification Generation

In order to give feedback to developers and to facilitate automatic modular analysis, Rosa also provides automatic specification generation. By this we mean that the programmer still needs to provide the environment specification in the form of preconditions, but our tool automatically computes an accurate postcondition. Formally, we can rewrite the constraint (5.1) as

∀x, res. (∃y. P(x) ∧ body(x, y, res)) → Q(x, res)

where Q is now unknown. We obtain the most accurate postcondition Q by applying quantifier elimination (QE) to P(x) ∧ body(x, y, res) to eliminate y. The theory of arithmetic over reals admits QE, so it is theoretically possible to use this approach. We do not currently use a full QE procedure for specification generation, as it is expensive and it

is not clear whether the returned expressions would be of a suitable format. Instead, we use our approximation approach, which computes ranges and maximum errors in a forward fashion and computes an (over-)approximation of a postcondition of the form res ∈ [a, b] ∧ res ± u. When proving a postcondition, our tool automatically generates these specifications and provides them as feedback to the user.
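For instance, a generated postcondition is reported back in the shape of an ensuring clause; the concrete constants below are invented purely for illustration.

} ensuring (res => -0.1785 <= res && res <= 0.0001 && res +/- 8.2e-16)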

5.2.3 Difficulty of Simple Encoding into SMT solvers

For small functions we can already prove interesting properties by using the exact encoding of the problem just described and discharging the verification constraints with Z3. Consider the following code a programmer may write to implement the third B-spline basis function from subsection 3.2.2.

def bspline3(x: Real): Real = {
  require(0 <= x && x <= 1 && x +/- 1e-13)
  -x*x*x / 6.0
} ensuring (res => -0.17 <= res && res <= 0.05 && res +/- 1e-11)

Functions and the corresponding verification conditions of this complexity are already within the capabilities of the nonlinear solver in Z3 [92]. For more complex functions, however, Z3 does not (yet) provide an answer in reasonable time, or returns unknown. Whether alternative techniques in SMT solvers can help in such cases remains to be seen [27].

5.2.4 Compilation Algorithm

Since a direct attempt to solve verification conditions using an off-the-shelf solver alone is not satisfactory, we use approximations to tackle the verification. Before we address the individual challenges, we present here an overview of our compilation algorithm.

Given a specification or program over reals and possible target data type(s), Rosa generates code over floating-point or fixed-point numbers that satisfies the given pre- and postconditions (and thus meets the target accuracy). Figure 5.2 presents a high-level view of our compilation algorithm. Rosa first analyses the entire specification and generates one verification condition for each postcondition to be proven. To obtain a modular algorithm, Rosa also generates verification conditions that check that at each function call the precondition of the called function is satisfied. The methods are then sorted by occurring function calls, which allows us to re-use already computed postconditions of function calls in a modular analysis. If the user specifies one target data type, the remaining part of the compilation process is performed with respect to this data type's precision. If not, or if the user specified several possible types, Rosa performs a binary search over the possible types to find the least precise one in the sorted list that satisfies all specifications.


Input: spec: specification over Reals, prec: candidate precisions

for fnc ← spec.fncs
  fnc.vcs = generateVCs(fnc)
spec.fncs.sortBy((f1, f2) => f1 ⊆ f2.fncCalls)

while prec ≠ ∅ and notProven(spec.fncs)
  precision = prec.nextPrecise
  for fnc ← spec.fncs
    for vc ← fnc.vcs
      while vc.hasNextApproximation ∧ notProven(vc)
        approx = getNextApproximation(vc, precision)
        vc.status = checkWithZ3(approx)
    generateSpec(fnc)
generateCode(spec)

Output: floating-point or fixed-point code

Figure 5.2 – Compilation algorithm

The user can provide the list of possible data types manually and sort them by her individual preference. Currently, the analysis is performed separately for each data type, which is not a big issue performance-wise due to the relatively small number of alternatives. We did identify certain shared computations between iterations which could be exploited in the future for more efficient compilation. In order for the compilation process to succeed, the specification has to be met with respect to some given finite-precision arithmetic. We envision that in the future the compilation task will also include automatic accuracy-improving code optimizations.

Rosa can currently generate Scala code over

• fixed-point arithmetic with a 16 or 32 bit width, or

• floating-point arithmetic in single (32 bit), double (64 bit), double-double (128 bit) and quad-double (256 bit) precision.

Our approach is parametric in the bit width of the representations and can thus be used on different platforms, from embedded controllers without floating-point units (where fixed-point implementations are needed), to platforms that expose high-precision floating-point arithmetic in their instruction set architecture. Using the QuadDouble library [82, 16] we can also emit code that uses precision twice or four times that of the ubiquitous Double data type. When multiple representations are available (as specified by the user), the compiler can select, e.g., the smallest representation needed to deliver the desired number of trusted significant digits. We currently do not support mixing data types within one program, but this is a possible avenue for future work.
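The selection among several candidate data types can be sketched as a binary search over the candidates sorted from least to most precise, assuming that verification success is monotone in precision. The following generic Scala sketch is illustrative only; none of these names are part of Rosa's actual API.

// Hypothetical sketch: find the least precise candidate that verifies,
// assuming candidates are sorted from least to most precise and that
// verification success is monotone in precision.
def leastPreciseVerifying[T](candidates: Vector[T],
                             verifies: T => Boolean): Option[T] = {
  var lo = 0
  var hi = candidates.length - 1
  var found: Option[T] = None
  while (lo <= hi) {
    val mid = lo + (hi - lo) / 2
    if (verifies(candidates(mid))) {  // this precision works; try a less precise one
      found = Some(candidates(mid))
      hi = mid - 1
    } else {                          // too imprecise; search the more precise half
      lo = mid + 1
    }
  }
  found
}

Whether a binary or a linear search is used matters little in practice, given the small number of candidate types.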


[Figure: pipeline over three dimensions — function calls (inline postcondition / inline body), paths (merging / path-wise), and arithmetic (approximate uncertainties / approximate ranges) — with the straight encoding as the first approximation.]

Figure 5.3 – Approximation pipeline

In case the verification part of compilation fails, our tool prints a failure report with the best postconditions Rosa was able to compute. The user can then use the generated specifications to gain insight into why and where his or her program does not satisfy the requirements. This computed postcondition is also printed with the successfully generated program.

While we have implemented our tool to accept specifications in a domain-specific language embedded in Scala and to generate code in Scala, all our techniques apply equally to all programming languages and hardware that support the floating-point features we assume (subsection 2.1.1).

5.2.5 Verification with Approximations

We now describe the approximations Rosa uses to soundly compile more interesting programs. We use an approximation pipeline, which exploits different combinations of approximations of nonlinear arithmetic, function calls, and paths due to conditionals. Each can be approximated at different levels, and we have observed in our experiments that one size does not fit all: a combination of different approximations is most successful in proving the verification conditions we encountered in our examples. The computed approximations can also be directly used to generate the postconditions.

For each verification condition we thus construct approximations until we are able to prove one, or until we run out of approximations, in which case we report the verification as failed. We can thus view verification as a sequence of approximations to be proven. We illustrate the pipeline that computes the different approximations in Figure 5.3. The first approximation (indicated by the long arrow) is to use Z3 alone on the entire constraint constructed by the rules in Table 5.1. This is indeed an approximation, as all function calls are treated as uninterpreted functions in this case. This approximation only works for floating-point precisions in very simple cases without function calls. Then, taking all possible combinations of subcomponents in our pipeline, we obtain the other approximations, which are filtered according to the presence of function calls or conditional branches.


Function calls If the verification constraint contains function calls, Rosa first attempts to inline postconditions. We support inlining of both user-provided postconditions and postconditions computed by our own specification generation procedure. If this is not accurate enough (and the function is not recursive), we inline the entire function body.

Postcondition inlining is implemented by replacing the function call with a fresh variable and constraining it with the postcondition. Thus, if verification succeeds with inlining the postcondition, we avoid having to consider each path of the inlined function separately and can perform modular verification, avoiding a potential path-explosion problem. Obviously, such modular verification is only feasible when postconditions are accurate enough. Section 6.2 presents an improvement over the error bound computation from chapter 2.
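To illustrate, inlining a postcondition of the form -1.0 < res && res < 1.0 && res +/- 1e-14 (the constants are only an example) for a call val z1 = g(x) roughly amounts to adding the following constraints, with res_1 and res̃_1 fresh variables standing for the ideal and actual result of the call:

-1.0 < res_1  ∧  res_1 < 1.0        (ideal part of the inlined postcondition)
|res_1 − res̃_1| ≤ 1e-14             (error part, relating ideal and actual result)
z1 = res_1  ∧  z̃1 = res̃_1           (the call itself is replaced by the fresh variables)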

Paths In the case of several paths through the program, we have the option to consider each path separately or to merge results at each join in the control-flow graph. This introduces a trade-off between efficiency and accuracy: on one hand, considering each path separately leads to an exponential number of paths to consider; on the other hand, merging at each join loses correlation information between variables, which may be necessary to prove certain properties. Our approximation pipeline chooses merging first, before resorting to path-by-path verification in case of failure. We believe that other techniques for exploring the path space could also be integrated into our tool [97, 34]. Another possible improvement would be heuristics that select a different order of approximations depending on particular characteristics of the verification condition.

Another aspect of conditional branches is discontinuity errors. These capture the difference in the result when the finite-precision computation takes a different path through the program than the ideal real-valued one. Whether this divergence happens and how large it becomes depends on the errors of the variables in the branch condition. Section 7.1 presents two possible approaches we have developed to tackle this challenging problem. Since this computation may be costly, Rosa includes a @robust annotation, which signals that the discontinuity errors have already been treated and need not be re-checked.

Arithmetic The arithmetic part of the verification constraints generated by Table 5.1 can essentially be divided into the ideal part and the actual part, which includes round-off errors at each computation step. The ideal part determines whether the ideal range constraints in the postcondition are satisfied, and the actual part determines whether the uncertainty part of the postcondition is satisfied.

Based on this, our tool first constructs an approximation which leaves the ideal part unchanged, but replaces the actual part of the constraint by the computed uncertainty bound. This effectively removes a large number of variables and is often a sufficient simplification for Z3 to succeed in verifying the entire constraint. If Z3 is still not able to prove the constraint, our tool constructs the next approximation by also replacing the ideal part, this

time with a constraint on the result's range which has been computed previously by our approximation procedure. Note that this second approximation may not have enough information to prove a more complex postcondition, as correlation information is lost. We note that the computation of ranges and errors is the same for both approximations, and thus trying both does not affect efficiency significantly. In our experiments, Z3 is able to prove the ideal, real-valued part in most cases, so this second approximation is rarely used.

Our approximation for arithmetic satisfies the following:

([a, b], err) = evalWithError(P, expr)  ⇒
    ∀x, x̃, res, res̃. P(x, x̃) ∧ res = expr(x) ∧ res̃ = expr̃(x̃)
        → a ≤ res ∧ res ≤ b ∧ |res − res̃| < err

That is, the procedure evalWithError computes a sound over-approximation of the range of an expression and of the uncertainty on the output. In particular, the affine arithmetic-based procedure from chapter 2 for ranges of inputs satisfies this constraint and can be used here. It turns out, however, that for many computations, even short ones, this procedure is very inaccurate, and we present significant improvements in chapter 6, using the power of a nonlinear SMT solver both for the estimation of the ranges and of the errors.

Counter-examples Our procedure is sound because our constraints over-approximate the actual errors. Furthermore, even in the full constraint as generated from Table 5.1, round-off errors are over-approximated, since we assume the worst-case error bound at each step. While this ensures soundness, it also introduces incompleteness, as we may fail to validate a specification because our over-approximation is too large. This implies that counterexamples reported by Z3 are in general only valid if they disprove the ideal real-valued part of the verification constraint. Our tool checks whether this is the case by constructing a constraint with only the real-valued part, and reports the counterexamples if any are returned by Z3.

5.3 Experience

Before we explain our employed techniques in detail, we illustrate the capabilities of our system on a number of examples.

5.3.1 Polynomial Approximations of Sine

We illustrate the verification algorithm on the example in Figure 5.4, using double floating-point precision as the target. The functions sineTaylor and sineOrder3 are verified first, since they do not contain function calls. Verification with the straight encoding fails. Next, Rosa computes the round-off errors on the output and Z3 succeeds in proving the resulting


def comparisonValid(x: Real): Real = {
  require(-2.0 < x && x < 2.0)
  val z1 = sineTaylor(x)
  val z2 = sineOrder3(x)
  z1 - z2
} ensuring(res => res <= 0.1 && res +/- 5e-14)

def comparisonInvalid(x: Real): Real = {
  require(-2.0 < x && x < 2.0)
  val z1 = sineTaylor(x)
  val z2 = sineOrder3(x)
  z1 - z2
} ensuring(res => res <= 0.01 && res +/- 5e-14)

def sineTaylor(x: Real): Real = {
  require(-2.0 < x && x < 2.0)
  x - (x*x*x)/6.0 + (x*x*x*x*x)/120.0 - (x*x*x*x*x*x*x)/5040.0
} ensuring(res => -1.0 < res && res < 1.0 && res +/- 1e-14)

def sineOrder3(x: Real): Real = {
  require(-2.0 < x && x < 2.0)
  0.954929658551372 * x - 0.12900613773279798*(x*x*x)
} ensuring(res => -1.0 < res && res < 1.0 && res +/- 1e-14)

Figure 5.4 – Different polynomial approximations of sine

constraint with the ideal part untouched. From this approximation Rosa directly computes a new, more accurate postcondition; in particular it can narrow the resulting errors to 1.48e-15 and 1.23e-15 respectively. Next, our tool considers the comparisonValid function. Inlining only the postcondition is not enough in this case, but computing the error approximation with the functions inlined succeeds in verifying the postcondition. Note that our tool does not approximate the real-valued portion of the constraint, i.e. Z3 is used directly to verify the constraint z1 − z2 ≤ 0.1. This illustrates our separation of the real-valued reasoning from the finite-precision implementation: with this separation we can use a real arithmetic solver for the algorithmic reasoning, and verify with our error computation that the results are still valid within the error bounds in the implementation. Finally, the tool verifies that the preconditions of the function calls are satisfied by using Z3 alone. Verification of the function comparisonInvalid fails with all approximations. Our tool is able to determine that the ideal real-valued constraint alone (z1 − z2 ≤ 0.01) is not valid, reports a counterexample (x = 1.875) and returns invalid as the verification result.


5.3.2 Trade-off between Accuracy and Efficiency

Which data type is suitable for a given program depends on the parameter ranges and the code itself, but also on the accuracy required by the application using the code. For example, take the functions in Figure 5.5. Depending on which accuracy on the output the user needs, our tool will select different data types. For the requirement res +/- 1e-12, as specified, Double is a suitable choice for doppler and turbine [151]; for the jetEngine example [10], however, this is not sufficient, and thus DoubleDouble would be selected by our tool. The user can influence which data types are preferred by supplying a list to our tool, ordered by her preference.

Figure 5.6 illustrates the trade-off between the accuracy achieved by different data types and the runtime of the compiled code generated by our tool. We used the Caliper [1] framework for benchmarking the running times.

def doppler(u: Real, v: Real, T: Real): Real = {
  require(-100 < u && u < 100 && 20 < v && v < 20000 && -30 < T && T < 50)
  val t1 = 331.4 + 0.6 * T
  (- (t1) * v) / ((t1 + u)*(t1 + u))
} ensuring(res => res +/- 1e-12)

def jetEngine(x1: Real, x2: Real): Real = {
  require(-5 < x1 && x1 < 5 && -20 < x2 && x2 < 5)
  val t = (3*x1*x1 + 2*x2 - x1)
  x1 + ((2*x1*(t/(x1*x1 + 1))*(t/(x1*x1 + 1) - 3) + x1*x1*(4*(t/(x1*x1 + 1)) - 6)) *
    (x1*x1 + 1) + 3*x1*x1*(t/(x1*x1 + 1)) + x1*x1*x1 + x1 +
    3*((3*x1*x1 + 2*x2 - x1)/(x1*x1 + 1)))
} ensuring(res => res +/- 1e-12)

def turbine(v: Real, w: Real, r: Real): (Real, Real, Real) = {
  require(-4.5 < v && v < -0.3 && 0.4 < w && w < 0.9 && 3.8 < r && r < 7.8)
  val t1 = 3 + 2/(r*r) - 0.125*(3 - 2*v)*(w*w*r*r)/(1 - v) - 4.5
  val t2 = 6*v - 0.5 * v * (w*w*r*r) / (1 - v) - 2.5
  val t3 = 3 - 2/(r*r) - 0.125 * (1 + 2*v) * (w*w*r*r) / (1 - v) - 0.5
  (t1, t2, t3)
} ensuring (_ match { case (r1, r2, r3) =>
  r1 +/- 1e-12 && r2 +/- 1e-12 && r3 +/- 1e-12 })

Figure 5.5 – Benchmark functions from physics and control systems


[Plots omitted: for each of the Doppler, Jet Engine and Turbine benchmarks, the runtime in nanoseconds and the analyzed maximum absolute error are shown for 16 bit and 32 bit fixed-point, Float, Double, DDouble and QDouble.]

Figure 5.6 – Runtimes of benchmarks compiled for different precisions in nanoseconds (left) vs. abs. error bounds computed for that precision by our tool (right).

For our benchmarks with their limited input ranges, 32-bit fixed-point implementations provide better accuracy than single floating-point precision, because single precision has to accommodate a larger dynamic range, which reduces the number of bits available for the mantissa. That said, fixed-point implementations run slower, at least on the JVM, than the more accurate double floating-point arithmetic with its dedicated hardware support. However, the choice of fixed-point rather than floating-point arithmetic may also be due to such hardware being unavailable. Our tool can thus support a wide variety of applications with different requirements. We also note that across the three (not specially selected) benchmarks the results are very consistent, and we expect similar behavior for other applications as well.


6 Computing Precise Range and Error Bounds

This chapter presents our techniques that tackle one of the main challenges when analyzing and verifying numerical code: nonlinear arithmetic. Since a direct translation of finite-precision arithmetic into a real-valued constraint is not tractable at this point (see chapter 5), we need to approximate both the ranges of variables and the round-off errors committed.

Being able to accurately compute a sound range of an expression is important by itself for many applications, as many functions and algorithms work only within limited domains. Range computations are also often building blocks for many (sound) algorithms. For example, round-off errors depend directly on variable ranges, hence we need to be able to compute them accurately. We will also use range computations extensively for the accurate computation of propagation and discontinuity errors in this and the next chapter. We observe that affine arithmetic alone often cannot provide satisfying results when faced with nonlinear arithmetic. We show in the first part of this chapter how we combine nonlinear SMT solving with affine arithmetic to obtain the needed tight bounds. This work is based on [47].

In the second part of this chapter, we propose a new error computation based on separating the propagation of existing errors from the roundoff or truncation errors committed during the computation. This separation allows us to distinguish the implementation aspects from the mathematical properties of the underlying function and handle them individually with appropriate techniques. In particular, this separation allows us to directly use the properties of the real-valued functions underlying the finite-precision implementation. Such a separation of errors applies fairly generally: it enabled us to i) improve computed error bounds on straight-line nonlinear code (this chapter), ii) characterize errors in loops as functions of the number of iterations (chapter 7), and iii) scale discontinuity error computations to multivariate functions (chapter 7).

Using the separation of errors, our new propagation procedure improves the computed error bounds compared to affine arithmetic. Our approach considers an entire arithmetic expression at a time, computing an approximation of the global effect of the function on the input errors, in contrast to the local linear approximation of affine arithmetic. Our procedure

is backed by our new range computation, which allows us to capture nonlinear correlations precisely. Moreover, it provides information about the sensitivity of the function with respect to the errors, which is useful for understanding the behavior of the function and for modular inter-procedural analysis. This work is currently under submission.

6.1 Range Bounds

The goal is to compute an accurate bound on the real-valued range of a nonlinear expression, given ranges for its inputs. So far, we have performed range computation with interval or affine arithmetic, and argued that most of the time the latter gives more accurate results, since it takes into account linear correlations. However, both arithmetics perform over-approximations which become significant for nonlinear expressions, especially when the input intervals are not small (radius > 1). In the case of nonlinear expressions the results computed by affine arithmetic can actually become worse than those of interval arithmetic. For example, x ∗ y with x = [−5, 3], y = [−3, 1] gives [−13, 15] in affine arithmetic and [−9, 15] in interval arithmetic. This over-approximation leads to less accurate bounds and round-off error estimates, but becomes critical when division or square root operations are involved. For example, when we try to analyze the jet engine controller [10] from Figure 5.5, interval and affine arithmetic provide the bound [−∞, ∞] on the output, since neither can bound the denominators in the divisions away from zero. They fail to do this because of the nonlinearity and the high correlation of the (only) two inputs x1 and x2.

Furthermore, we have seen in the triangle area example (Figure 5.1) that additional constraints beyond simple range bounds can be very useful. The precondition bounds the inputs as a, b, c ∈ [1, 9], but the formula is useful only for valid triangles, i.e. when every two sides together are longer than the third. If not, we will get an error at the latest when we try to take the square root of a negative number. Interval and affine arithmetic again fail to analyze this example, since they cannot, without the valid-triangle constraints, bound the radicand away from the negative range.

6.1.1 Range Computation

The input to our algorithm is a nonlinear real-valued expression expr and a precondition P on its inputs, which specifies, among possibly other constraints, ranges on all input variables x ∈ Rⁿ. The output is an interval [a, b] which satisfies the following:

[a, b] = getRange(P, expr)  ⇒  ∀x, res. P(x) ∧ res = expr(x) → (a ≤ res ∧ res ≤ b)

Observation: A nonlinear theorem prover such as the one that comes with Z3 can decide with fairly good accuracy whether a given bound is sound or not. That is, we can check with a prover

whether for an expression e the range [a, b] is a sound interval enclosure. This observation is the basis of our range computation.

The algorithm for computing the lower bound of a range is given in Figure 6.1; the computation for the upper bound is symmetric. For each range to be computed, our tool first computes an initial sound estimate of the range with interval arithmetic. It then performs an initial quick check to test whether the computed first-approximation bounds are already tight. If not, it uses the first approximation as the starting point and narrows down the lower and upper bounds using a binary search. At each step of the binary search our tool uses Z3 to confirm or reject the newly proposed bound.

The search stops when Z3 fails, i.e. returns unknown for a query or cannot answer within a given timeout, when the difference between subsequent bounds is smaller than an accuracy threshold, or when the maximum number of iterations is reached. This stopping criterion can be set dynamically.

Additional constraints In addition to the input ranges, the precondition may also contain further constraints on the variables, such as the valid-triangle constraints from Figure 5.1. In interval-based approaches we can only consider input intervals that satisfy this constraint for

def getRange(expr, precondition, precision, maxIterations):
  z3.assertConstraint(precondition)
  [aInit, bInit] = evalInterval(expr, precondition.ranges)

  // lower bound
  if z3.checkSat(expr < aInit + precision) == UNSAT
    a = aInit
    b = bInit
    numIterations = 0
    while (b - a) > precision ∧ numIterations < maxIterations
      mid = a + (b - a) / 2
      numIterations++
      z3.checkSat(expr < mid) match
        case SAT     ⇒ b = mid
        case UNSAT   ⇒ a = mid
        case Unknown ⇒ break
    aNew = a
  else
    aNew = aInit

  // upper bound symmetrically
  bNew = ...
  return [aNew, bNew]

Figure 6.1 – Algorithm for computing the range of an expression

all values, and thus have to check several (and possibly many) cases. In our approach, since we are using Z3 to check the soundness of bounds, we can assert the additional constraints up-front, and then all subsequent checks are performed with respect to all additional and initial constraints. This allows us to avoid interval subdivisions due to inaccuracies or problem-specific constraints such as those in the triangle example. This becomes especially valuable in the presence of multiple variables, where we may otherwise need an exponential number of subdivisions.
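A minimal Scala sketch of this pattern follows, under the assumption of a simplified solver interface; Formula, Solver and the result encoding are placeholders and not the actual Z3 bindings used by Rosa.

// Hypothetical sketch: the precondition, including problem-specific constraints
// such as the valid-triangle conditions, is asserted once; every candidate bound
// is then checked under all of these constraints together.
trait Formula
trait Solver {
  def assertConstraint(f: Formula): Unit
  def checkSat(f: Formula): Option[Boolean]   // Some(true)=SAT, Some(false)=UNSAT, None=unknown
}

def isSoundLowerBound(solver: Solver, precondition: Formula,
                      exprBelow: Double => Formula,   // builds the formula  expr < bound
                      candidate: Double): Boolean = {
  solver.assertConstraint(precondition)
  solver.checkSat(exprBelow(candidate)) match {
    case Some(false) => true    // UNSAT: no value of expr lies below the candidate bound
    case _           => false   // SAT or unknown: the candidate cannot be accepted
  }
}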

A very similar approach has been developed independently in [94] in the context of bit-width allocation for fixed-point arithmetic, using HySAT [63] as the back-end solver. We also identify the potential of additional constraints and make use of them extensively; in fact, most of the following techniques rely on them.

Choice of Solver We use the nlsat solver inside Z3 [92], which is based on conflict-driven clause learning (CDCL) together with a projection operator adapted from cylindrical algebraic decomposition (CAD) [40]. It supports real closed fields together with division and is thus suitable for our purpose, especially since Z3 is available through the Leon framework inside which Rosa is implemented. While Z3 supports 'only' real arithmetic, and our approach is thus limited to such code, it nonetheless covers a wide range of applications. Our algorithms can be straightforwardly extended to code including exponential or trigonometric functions, provided a suitable solver is available. iSAT3 [141] is a possible alternative solver, whose algorithm [64, 142] relies on a tight coupling of interval constraint propagation (ICP) with CDCL. dReal [68] implements a decision procedure based on delta-satisfiability [66] and also uses ICP, but inside a DPLL framework. MetiTarski [7] applies simplifications and approximations to reduce more complex operations to real closed fields and applies CAD on the resulting constraint. This list of possible solvers is certainly not exhaustive, and while they support many transcendental functions, they are also necessarily incomplete. Any solver that can determine unsatisfiability of real-valued formulas is suitable, and it would be an interesting future project to see how well different solvers work within our algorithms.

6.1.2 Error Propagation

Since the error propagation as described in subsection 2.4.3 assumed essentially two affine forms and we are replacing one of these by “Z3-powered” intervals, we need to adapt the propagation as well. That is, we want to define the evalWithError procedure:

([a, b], err) = evalWithError(P, expr)  ⇒
    ∀x, x̃, res, res̃. P(x, x̃) ∧ res = expr(x) ∧ res̃ = expr̃(x̃)
        → a ≤ res ∧ res ≤ b ∧ |res − res̃| < err

where expr̃ represents the expression evaluated in finite-precision arithmetic and x, x̃ are the ideal and actual variables. The precondition includes a specification of the ranges and uncertainties of initial variables and other additional constraints on the ideal variables. Recall that the uncertainty specification relates the ideal and actual variables.

The general idea is as before: we "execute" a computation while keeping track of the range of the current intermediate expression and its associated errors. From this range, we can then compute the round-off error committed at each computation step. Instead of using affine arithmetic to compute the intermediate ranges, we use the Z3-backed range computation and adapt the error propagation. In our adaptation, we represent every variable and intermediate computation result as a data type with the following components:

x : (range: Interval, err̂: AffineForm)

where range is the real-valued range of this variable, computed with Z3, and err̂ is the affine form representing the errors. The (over-approximation of the) actual range, including all uncertainties, is then given by totalRange = range + [err̂], where [err̂] denotes the interval represented by the affine form. In this application, we use the rational implementation of affine arithmetic. The modular nature of our static analyzer keeps the expressions to be analyzed relatively short, making it possible to use rationals instead of double-double precision floating-points. We found the rational implementation both clearer and more accurate than a floating-point one.

For the affine operations (addition, subtraction, and multiplication by a constant factor), the propagated errors are computed term-wise and thus in the same way as for standard affine arithmetic. For multiplication, division and square root, the propagation has to be adjusted, since our ranges are not affine terms themselves. In the following, we denote the range of a variable x by [x] and its associated error by the affine form err̂_x. When we write [x] ∗ err̂_y we mean that the interval [x] is converted into an affine form and the multiplication is performed in affine arithmetic. Multiplication is computed as

x ∗ y = ([x] + err̂_x)([y] + err̂_y)
      = [x] ∗ [y] + [x] ∗ err̂_y + [y] ∗ err̂_x + err̂_x ∗ err̂_y + ρ

where ρ is the new round-off error, computed as usual. Thus the first term contributes to the ideal range and the remaining three to the error affine form. The larger the factors [x] and [y] are, the larger the finally computed errors will be. In order to keep the over-approximation as small as possible, we evaluate [x] and [y] with our new range computation. Division is computed as

x / y = x ∗ (1/y) = ([x] + err̂_x)([1/y] + err̂_{1/y})
      = [x] ∗ [1/y] + [x] ∗ err̂_{1/y} + [1/y] ∗ err̂_x + err̂_x ∗ err̂_{1/y} + ρ


For square root, we first compute an error multiplier based on the affine approximation of square root: (√z)′ = 1/(2√z↓), where z = min(|a|, |b|), with [a, b] being the input interval and ↓ denoting rounding towards −∞. The propagated error is obtained by multiplying the error err̂_x by this factor term-wise.
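The representation and the multiplication rule above can be condensed into the following simplified Scala sketch. For brevity, the error is tracked here as a plain interval rather than an affine form, Double is used instead of rationals, and the tight range of the product as well as the roundoff model are passed in as placeholders for the Z3-backed range computation and the machine-epsilon of the chosen data type; none of this is Rosa's internal API.

// Hypothetical, simplified sketch of the (range, error) representation.
case class Interval(lo: Double, hi: Double) {
  def +(o: Interval) = Interval(lo + o.lo, hi + o.hi)
  def *(o: Interval) = {
    val ps = List(lo * o.lo, lo * o.hi, hi * o.lo, hi * o.hi)
    Interval(ps.min, ps.max)
  }
  def maxAbs: Double = math.max(math.abs(lo), math.abs(hi))
}

case class XNum(range: Interval, err: Interval) {
  def totalRange: Interval = range + err    // range seen by the finite-precision code

  // x * y: propagated error  [x]*err_y + [y]*err_x + err_x*err_y,  plus a fresh roundoff
  def times(y: XNum, tightRange: Interval, eps: Double): XNum = {
    val propagated = (range * y.err) + (y.range * err) + (err * y.err)
    val rho = (tightRange + propagated).maxAbs * eps      // simplistic roundoff model
    XNum(tightRange, propagated + Interval(-rho, rho))
  }
}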

Overflows and NaN Our procedure allows us to detect potential overflows, divisions by zero and square roots of negative values, as our tool computes the ranges of all intermediate values. We currently report these issues as compilation errors to the user.

6.1.3 Limitations

The limitation of this approach is clearly the ability of Z3 to check our constraints. We found its capabilities satisfactory, although we expect the performance to still improve significantly. To emphasize the difference to the constraints defined by Table 5.1, the real-valued constraints we use here for narrowing down ranges do not include roundoff errors. Since each roundoff introduces one variable per arithmetic operation, we reduce the number of variables significantly by considering only the real-valued computation. We also found several transformations helpful, such as rewriting powers (e.g. x ∗ x ∗ x to x³), multiplying out products and avoiding non-strict comparisons in the precondition, although the benefits are not entirely consistent. Note that at each step of our error computation, our tool computes the current range. Thus, even if Z3 fails to tighten the bound for some expressions, we still compute more accurate bounds than interval arithmetic overall in most cases, as the ranges of the remaining subexpressions have already been computed more accurately.

6.1.4 Experimental Results

We have chosen as benchmark functions several nonlinear expressions commonly used in physics, biology and chemistry [151, 133, 122], as well as benchmarks used in control systems [9] and suitable benchmarks from [56]. Experiments were performed on a desktop computer running Ubuntu 12.04.1 with a 3.5GHz i7 processor and 16GB of RAM. Table 6.1 compares the results of our range computation procedure against ranges obtained with standard interval arithmetic. We found affine arithmetic to give more pessimistic results on these benchmarks in our experiments; we believe that this is due to inaccuracies in computing nonlinear operations. Note, however, that we still use affine arithmetic to estimate errors given the computed ranges.

For our range computation, we set the default accuracy threshold to 1e-10 and the maximum number of iterations for the binary search to 50. To obtain an idea about the true ranges of our functions, we have also computed a lower bound on the range using simulations with 10^7 random inputs and with exact rational arithmetic evaluation of expressions. We observe that our range computation can significantly improve over standard interval bounds.


Benchmark         Our range                 Interval arithmetic       Simulated range

doppler1          [-137.639, -0.033951]     [-158.720, -0.029442]     [-136.346, -0.035273]
doppler2          [-230.991, -0.022729]     [-276.077, -0.019017]     [-227.841, -0.023235]
doppler3          [-83.066, -0.50744]       [-96.295, -0.43773]       [-82.624, -0.51570]
rigidBody1        [-705.0, 705.0]           [-705.0, 705.0]           [-697.132, 694.508]
rigidBody2        [-56010.1, 58740.0]       [-58740.0, 58740.0]       [-54997.635, 57938.052]
jetEngine         [-1997.037, 5109.338]     [-∞, ∞]                   [-1779.551, 4813.564]
turbine1          [-18.526, -1.9916]        [-58.330, -1.5505]        [-18.284, -1.9946]
turbine2          [-28.555, 3.8223]         [-29.437, 80.993]         [-28.528, 3.8107]
turbine3          [0.57172, 11.428]         [0.46610, 40.376]         [0.61170, 11.380]
verhulst          [0.31489, 1.1009]         [0.31489, 1.1009]         [0.36685, 0.94492]
predatorPrey      [0.039677, 0.33550]       [0.037277, 0.35711]       [0.039669, 0.33558]
carbonGas         [4.3032e6, 1.6740e7]      [2.0974e6, 3.4344e7]      [4.1508e6, 1.69074e7]
Sine              [-0.9999, 0.9999]         [-2.3012, 2.3012]         [-0.9999, 0.9999]
Sqrt              [1.0, 1.3985]             [0.83593, 1.5625]         [1.0, 1.3985]
Sine (order 3)    [-1.0001, 1.0001]         [-2.9420, 2.9420]         [-1.0, 1.0]

Table 6.1 – Comparison of ranges computed with our procedure against interval arithmetic and simulation. Simulations were performed with 10^7 random inputs. Ranges are rounded outwards. Affine arithmetic does not provide better results than interval arithmetic.

The jetEngine benchmark is a notable example, where interval arithmetic yields the bound [-∞, ∞], but our procedure can still provide bounds that are quite close to the simulated range.

Triangle Progression Table 6.2 presents another relevant experiment, evaluating the ability to use additional constraints during our range computation. For this experiment, we use double precision and the triangle example from Figure 5.1 with additional constraints allowing increasingly flat triangles, by setting the threshold (e.g. a + b > c + 1e-6) to the different values given in the first column. As the triangles become flatter, we observe an expected increase in the uncertainty on the output, since the formula becomes more prone to round-off errors. At threshold 1e-10 our range computation fails to provide the necessary accuracy and the radicand becomes possibly negative. Using our tool, the developer can go beyond rules of thumb and informal estimates and be confident that the computed area is accurate up to seven decimal digits even for triangles whose difference a + b − c is as small as 10^-9.

Benchmark            Area                     Max. abs. error

triangle1  (0.1)     [0.29432, 35.0741]       2.019e-11
triangle2  (1e-2)    [0.099375, 35.0741]      6.025e-11
triangle3  (1e-3)    [3.16031e-2, 35.0741]    1.895e-10
triangle4  (1e-4)    [9.9993e-3, 35.0741]     5.987e-10
triangle5  (1e-5)    [3.1622e-3, 35.0741]     1.894e-9
triangle6  (1e-6)    [9.9988e-4, 35.0741]     5.988e-9
triangle7  (1e-7)    [3.1567e-4, 35.0741]     1.897e-8
triangle8  (1e-8)    [9.8888e-5, 35.0741]     6.054e-8
triangle9  (1e-9)    [3.0517e-5, 35.0741]     1.962e-7
triangle10 (1e-10)   -                        -

Table 6.2 – Area computed and error on the result for increasingly flat triangles. All values are rounded outwards. Interval or affine arithmetic alone fails to provide any result.

6.2 Error Bounds

Now that we have reasonably accurate ranges, we address error propagation in straight-line nonlinear functions, without loops and branches. The input is a real-valued arithmetic expression representing a function f : Rⁿ → R over some inputs x_i ∈ R, absolute error bounds λ_i on the inputs, and a target precision. The arithmetic operators our tool accepts are {+, −, ∗, /, √}. We denote by f and x the exact ideal real-valued function and variables, and by f̃ : Rⁿ → R, x̃ ∈ Rⁿ their actual finite-precision counterparts. Note that for our analysis all variables are real-valued; the finite-precision variable x̃ is considered a noisy version of x. |x − x̃| ≤ λ then defines the input errors, where the absolute value is taken component-wise. We want to bound the absolute error on the result of evaluating f(x) in finite-precision arithmetic:

|f(x) − f̃(x̃)|,    where |x − x̃| ≤ λ

6.2.1 Separation of Errors

Approaches based on interval or affine arithmetic treat all errors equally in the sense that initial errors are propagated in the same way as round-off errors which are committed during the computation. We propose to separate these errors as follows:

|f(x) − f̃(x̃)| = |f(x) − f(x̃) + f(x̃) − f̃(x̃)|
             ≤ |f(x) − f(x̃)| + |f(x̃) − f̃(x̃)|        (6.1)

The first term captures the error in the result of f caused by the initial error between x and x̃. The second term covers the round-off error committed when evaluating f in finite precision,

but note that we compute this round-off error from an accurate input without initial error. Thus, we separate the overall error into the propagation of existing errors and the newly committed round-off errors. Figure 6.2 illustrates this separation.

We denote by σ_f : Rⁿ → R the function which returns the round-off error committed when evaluating an expression in finite-precision arithmetic: σ_f(x̃) = |f(x̃) − f̃(x̃)|. We omit the subscript f when it is clear from the context. We use the round-off error computation from section 6.1. Further, g : Rⁿ → R denotes a function which bounds the difference in f, given a difference in its inputs: |f(x) − f(y)| ≤ g(|x − y|). When f is multivariate, the absolute value is component-wise, i.e. g(|x_1 − y_1|, ..., |x_n − y_n|), but when it is clear from the context, we will write g(|x − y|) for clarity. Thus, the overall numerical error is given by:

|f(x) − f̃(x̃)| ≤ g(|x − x̃|) + σ(x̃)        (6.2)

This procedure extends naturally to vector-valued functions, where we compute errors component-wise:

|f_i(x) − f̃_i(x̃)| ≤ g_i(|x − x̃|) + σ_{f,i}(x̃),    where g, σ_f : Rⁿ → Rⁿ.

Note that the separation in Equation 6.1 is not unique. For instance, we could have chosen |f(x) − f̃(x)| + |f̃(x) − f̃(x̃)|. The first term now corresponds to round-off errors, but the second requires bounding the difference of f̃ over a certain input interval. In the separation that we have chosen, we need to compute the difference over the real-valued f, which is a much simpler function than its finite-precision counterpart.

[Figure: the curves of f and f̃ evaluated at x and x̃; the distance from f̃(x̃) to f(x̃) is the roundoff error, and the distance from f(x̃) to f(x) is the propagated initial error.]

Figure 6.2 – Illustration of the separation of errors


6.2.2 Computing Propagation Coefficients

We instantiate Equation 6.2 with g(x) = K · x and bound the deviation on the result by a linear function in the input errors:

|f(x) − f(y)| ≤ K |x − y|

We will use this definition for most of the rest of this thesis. The constant K is to be determined for each function individually, and is usually called the Lipschitz constant. We will also use the more descriptive name propagation coefficient in this context. Note that we need to compute the propagation coefficient K for the mathematical function f and not its finite-precision counterpart f̃.

At a high level, error amplification or diminution depends on the derivative of the function at the inputs. The steeper the function, i.e. the larger the derivative, the more the errors are magnified. We formally derive the computation of the propagation coefficients Ki for a multivariate function f in the following.

Let h : [0, 1] → R such that h(θ) := f(y + θ(z − y)). Without loss of generality, assume y < z. Then h(0) = f(y) and h(1) = f(z), and

d/dθ h(θ) = ∇f(y + θ(z − y)) · (z − y)

By the mean value theorem:

f(z) − f(y) = h(1) − h(0) = h′(ζ),    where ζ ∈ [0, 1]

Then taking absolute values on both sides:

|f(z) − f(y)| = |h′(ζ)|
             = |∇f(y + ζ(z − y)) · (z − y)|
             = |(∂f/∂x_1 |_w , ..., ∂f/∂x_n |_w) · (z − y)|,    w = y + ζ(z − y)
             = |∂f/∂x_1 · (z_1 − y_1) + ··· + ∂f/∂x_n · (z_n − y_n)|
             ≤ |∂f/∂x_1| · |z_1 − y_1| + ··· + |∂f/∂x_n| · |z_n − y_n|        (**)

where the partial derivatives are evaluated at w = y + ζ(z − y), which we omit for readability. We compute the partial derivatives symbolically, but before we can evaluate the propagation error, we need to determine over which inputs we need to evaluate them.


Bounding Ranges of Partial Derivatives In the above, the value of w is constrained to lie in w ∈ [y, z], so for a sound analysis we have to determine the maximum absolute value of the partial derivative over [y, z]. In our application, y and z range over the values of x and x̃ respectively, so we compute the maximum absolute value of ∂f/∂x_i over the interval that contains the intervals of x and x̃. Both interval and affine arithmetic suffer from possibly large over-approximations, which is why we use the range computation from section 6.1 to bound the ranges of the derivatives. Finally, we have |y_i − z_i| ≤ λ_i and we obtain

|f(x) − f(x̃)| ≤ Σ_{i=1}^{n} K_i λ_i        (6.3)

where K_i = sup_{x,x̃} |∂f/∂x_i|. We show an example of this computation in subsection 6.2.3.
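The following self-contained Scala sketch illustrates the structure of this computation for expressions built from additions and multiplications. The expression type, the symbolic derivative and especially the naive grid-based maxAbsOnGrid stand-in are ours and purely for illustration; Rosa bounds the derivative ranges soundly with the Z3-backed procedure of section 6.1.

// Hypothetical sketch of the propagation bound of Equation 6.3:
//   |f(x) - f(x~)|  <=  sum_i K_i * lambda_i,   with  K_i = sup |df/dx_i|.
sealed trait Expr
case class Const(v: Double) extends Expr
case class Var(name: String) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Mul(l: Expr, r: Expr) extends Expr

def derive(e: Expr, x: String): Expr = e match {
  case Const(_)  => Const(0.0)
  case Var(n)    => Const(if (n == x) 1.0 else 0.0)
  case Add(l, r) => Add(derive(l, x), derive(r, x))
  case Mul(l, r) => Add(Mul(derive(l, x), r), Mul(l, derive(r, x)))
}

def eval(e: Expr, env: Map[String, Double]): Double = e match {
  case Const(v)  => v
  case Var(n)    => env(n)
  case Add(l, r) => eval(l, env) + eval(r, env)
  case Mul(l, r) => eval(l, env) * eval(r, env)
}

// Naive, UNSOUND stand-in for the sound range computation of section 6.1:
// sample |e| on a grid over the input ranges (which should include the error intervals).
def maxAbsOnGrid(e: Expr, ranges: Map[String, (Double, Double)], steps: Int = 20): Double = {
  val names = ranges.keys.toList
  def points(lo: Double, hi: Double) = (0 to steps).map(i => lo + (hi - lo) * i / steps)
  def go(rest: List[String], env: Map[String, Double]): Double = rest match {
    case Nil     => math.abs(eval(e, env))
    case n :: ns => points(ranges(n)._1, ranges(n)._2).map(v => go(ns, env + (n -> v))).max
  }
  go(names, Map.empty)
}

// sum over all inputs of  K_i * lambda_i
def propagationBound(f: Expr, ranges: Map[String, (Double, Double)],
                     lambdas: Map[String, Double]): Double =
  ranges.keys.map(xi => maxAbsOnGrid(derive(f, xi), ranges) * lambdas(xi)).sum

For instance, calling propagationBound on the expression x ∗ y with ranges x ∈ [1, 2], y ∈ [−1, 2] and input errors of 1e-10 on each variable yields 4e-10, i.e. K_x = K_y = 2.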

Additional Constraints The propagation coefficients are computed using the input ranges. As we have already seen, we can often restrict the inputs further by additional constraints. Since we are using an SMT solver to bound the ranges of the partial derivatives, these additional constraints can naturally be taken into account to compute more accurate propagation coefficients.

Sensitivity to Input Errors Beyond providing a way to compute the propagated initial errors, Equation 6.3 also makes the sensitivity of the function to input errors explicit, at least to a linear approximation. That is, we can read off from Equation 6.3 by how much initial errors get magnified or diminished by the computation. The user can use this knowledge, for example, to determine which inputs need to be known more accurately, e.g. through more precise measurements. We report the values of K back to the user.

6.2.3 Relationship with Affine Arithmetic

Both our presented propagation procedure and propagation using affine arithmetic perform approximations. The question then arises: when is it preferable to use one over the other? Section 6.2.5 and our experience show empirically that for longer nonlinear computations, error propagation based on Lipschitz continuity gives better results, whereas for shorter and linear computations this is not the case. In this section, we present an analysis of this phenomenon based on an example.

Suppose we want to compute x ∗ y − x². For this discussion we consider propagation only and disregard round-off errors. We consider the case where x and y have an initial error of δ_x ε_1 and δ_y ε_2 respectively, where ε_i ∈ [−1, 1] are the formal noise symbols of AA. Without loss of generality, we assume δ_x, δ_y ≥ 0. We first derive the expression for the error with affine arithmetic and take the definition of multiplication from subsection 6.1.2. We denote by [x] the evaluation of the real-valued range of the variable x.


The total range of x is then the real-valued range plus the error: [x] + δ_x ε_1, where ε_1 ∈ [−1, 1]. Multiplying out, and removing the [x][y] − [x]² term (since it is not an error term), we obtain the expression for the error of x ∗ y − x²:

([y] δ_x ε_1 + [x] δ_y ε_2 + δ_x δ_y ε_3) − (2[x] δ_x ε_1 + δ_x δ_x ε_4)
    = ([y] − 2[x]) δ_x ε_1 + [x] δ_y ε_2 + δ_x δ_y ε_3 + δ_x δ_x ε_4        (6.4)

ε_3 and ε_4 are fresh noise symbols introduced by the nonlinear approximation. Now we compute the propagation coefficients:

∂f/∂x = y − 2x        ∂f/∂y = x

The error is then given by

|[y + δ_y ε_2 − 2(x + δ_x ε_1)]| δ_x + |[x + δ_x ε_1]| δ_y        (6.5)

We obtain this expression by instantiating Equation (**) with the range expressions of x and y. Note that the ranges used as the inputs for the evaluation of the partial derivatives include the errors. Multiplying out Equation 6.5 we obtain:

|[y − 2x]| δ_x + |[x]| δ_y + δ_x δ_y + δ_x δ_x + δ_x δ_x        (6.6)

With affine arithmetic we compute ranges for propagation at each computation step, i.e. in Equation 6.4 we compute [x] and [y] separately. In contrast, with our new technique, the range is computed once, taking into account all correlations between the variables x and y. It is these correlations that improve the computed error bounds. For instance, if we choose x ∈ [1, 5] and y ∈ [−1, 2] and we know that x < y, then by a step-wise computation we obtain [y] − 2[x] = [−1, 2] − 2[1, 5] = [−11, 0], whereas taking the correlations into account, we can narrow down the range of x to [1, 2] and obtain [y − 2x] = [−1, 2] − 2[1, 2] = [−5, 0]. Hence, since we compute the maximum absolute value of these ranges for the error computation, AA will use the factor 11, whereas our approach will use 5.

On the other hand, comparing Equation 6.6 with Equation 6.4, we see that the term δ_x δ_x is included twice with our approach, whereas in the affine propagation it is only included once. We conclude that a Lipschitz-based error propagation is most useful for longer computations, where it can leverage correlations. In other cases we keep the existing affine arithmetic-based technique; it does not require a two-step computation, so we want to use it for smaller expressions. We remark that for linear operations the two approaches are equivalent.

6.2.4 Higher Order Taylor Approximation

In subsection 6.2.2 we presented one possible instantiation of the error propagation function g. The resulting propagation function is a function of the input errors. The errors do, however,

also depend on the ranges of the inputs. This fact is only implicitly reflected in the computed coefficients via the ranges used for bounding the partial derivatives. We can in fact make this relationship more explicit. Recall Taylor's Theorem in several variables:

Taylor's Theorem Suppose f : Rⁿ → R is of class C^{k+1} on an open convex set S. If a ∈ S and a + h ∈ S, then

f(a + h) = Σ_{|α| ≤ k} (∂^α f(a) / α!) h^α + R_{α,k}(h)

where the remainder in Lagrange's form is given by:

R_{α,k}(h) = Σ_{|α| = k+1} (∂^α f(a + ch) / α!) h^α

for some c ∈ (0, 1). Computing the Taylor expansion of f(x̃) to first order in our setting:

f(x̃) = f(x) + Σ_{j=1}^{n} ∂_j f(x) h_j + (1/2) Σ_{j,k=1}^{n} ∂_j ∂_k f(w) h_j h_k

and taking absolute values, we obtain

|f(x̃) − f(x)| ≤ |Σ_{j=1}^{n} ∂_j f(x) h_j| + (1/2) |Σ_{j,k=1}^{n} H_{jk}(w) h_j h_k|

where w is in the interval containing x and x̃, and H is the Hessian matrix of f. If we consider the expansion for k = 1, we obtain an expression for computing the upper bound on the propagated error which is also a function of the input values.

We observe that the second-order Taylor remainder is in general small, since it involves the squares of the initial errors, which we assume to be small in our applications. We can bound the remainder with the same technique we use to compute the propagation coefficients. Then, together with the partial derivatives of f, we obtain error specifications which can be used for a more accurate modular verification process.
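A rough Scala sketch of the shape such a first-order summary could take for a unary function, and of its instantiation at a call site with a narrower range and a smaller input error, follows; the names and the exact shape are illustrative only, and Rosa's actual summaries are built from the symbolic partial derivatives and a sound remainder bound computed as above.

// Hypothetical shape of a first-order error summary for a unary function f:
// call-site error <= sup |f'| over the call-site range * input error
//                    + bound on the second-order remainder term.
case class ErrorSummary(
    derivativeBound: (Double, Double) => Double,  // (lo, hi) => sup |f'(x)| over [lo, hi]
    remainderBound: Double => Double              // input error => bound on the remainder
)

def instantiate(s: ErrorSummary, lo: Double, hi: Double, inputError: Double): Double =
  s.derivativeBound(lo, hi) * inputError + s.remainderBound(inputError)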

Application to Interprocedural Analysis Having more accurate specifications enables us to re-use methods across different call sites, with possibly different constraints on the arguments. We present an example in subsection 6.2.5 which demonstrates the effectiveness of this summarization technique. We are not aware of other work that is capable of computing such summaries for numerical errors. [74] presents an approach to compute method summaries based on affine arithmetic evaluation and instantiation. These summaries, however, capture only the real-valued ranges and not the numerical errors.


6.2.5 Experimental Results

We perform our experiments mostly with double precision, unless otherwise specified, as this is a common choice for numerical programs. Note, however, that Rosa supports both floating-point arithmetic with different precisions and fixed-point arithmetic with different bit lengths. In our experience, while the absolute errors naturally change with varying precisions and data types, the relative differences in comparisons remain very similar.

We compare our results against those obtained by Fluctuat, which is the only other tool we are aware of that can compute sound numerical error bounds. We further compare two versions of Rosa: one where the round-off error computation is based only on affine arithmetic with accurate ranges, as described in section 6.2, and one where we use our Lipschitz-based error computation. We will call the latter version Rosa+. Unless otherwise stated, all numerical error values are rounded. Experiments were performed on a desktop computer running Ubuntu 12.04.4 with a 3.5GHz i7 processor and 16GB of RAM.

Straight-line Nonlinear Computation

We first evaluate our error propagation technique for straight-line nonlinear code on our benchmarks from section 6.1. The results are summarized in Table 6.3. The error computations in Rosa and in Fluctuat are very similar, and essentially differ only in how the ranges of variables are constrained: we use an SMT solver to narrow down ranges, whereas Fluctuat uses a logical product of the affine forms with an abstract domain [70]. We also list an under-approximation of the errors, which we obtained with a simulation with 10^7 random inputs. We are not aware of such a thorough comparison having been performed before.

The initial errors in the top section of Table 6.3 are round-off errors only; in the bottom section we add an initial absolute error of 1e-11 to all inputs. We can see that Rosa+ improves the computed error estimates in most cases. Even for benchmarks where the initial errors are only round-offs, we can still sometimes halve the error obtained with previous tools. For benchmarks with additional initial errors the effect is even larger.

Furthermore, we investigate the effect of refactoring expressions and applying our error propagation technique to each subexpression. For example, in the case of the doppler benchmark, we consider two formulations:

(- (331.4 + 0.6 * T) * v) / (((331.4 + 0.6 * T) + u)*((331.4 + 0.6 * T) + u))

which is often the formulation produced by code generation tools, and

val tmp = 331.4 + 0.6 * T
(-tmp * v) / ((tmp + u)*(tmp + u))

In the second case, we apply the error propagation twice, once for computing the error on tmp and once for the error on the result. The hope is to compute intermediate values


benchmark               Simulated    Fluctuat    Rosa        Rosa+

doppler                 3.90e-13     4.36e-13    4.29e-13    7.11e-14
dopplerRefactored       3.90e-13     4.19e-13    2.68e-13
jetengine               4.08e-8      1.16e-8     5.33e-9     5.46e-12
jetengineRefactored     4.08e-8      1.16e-8     4.91e-9
rigidBody1              2.28e-13     3.22e-13    3.22e-13    3.22e-13
rigidBody2              3.65e-11     3.65e-11    3.65e-11    2.19e-11
rigidBody2Refactored    3.65e-11     3.65e-11    3.65e-11
sine                    4.45e-16     7.97e-16    6.40e-16    5.18e-16
sineOrder3              3.34e-16     1.15e-15    1.23e-15    9.96e-16
sqroot                  4.45e-16     3.21e-13    3.09e-13    2.87e-13
turbine1                9.20e-14     8.87e-14    5.99e-14    1.07e-14
turbine1Refactored      9.26e-14     8.87e-14    5.15e-14
turbine2                1.29e-13     1.23e-13    7.67e-14    1.43e-14
turbine2Refactored      1.34e-13     1.23e-13    6.30e-14
turbine3                6.99e-14     6.27e-14    4.62e-14    5.33e-15
turbine3Refactored      7.03e-14     6.27e-14    4.01e-14
verhulst                2.23e-16     4.24e-16    4.74e-16    4.67e-16
predatorPrey            1.12e-16     2.09e-16    2.08e-16    1.98e-16
carbonGas               3.73e-9      4.02e-8     3.35e-8     1.60e-8

with added errors
dopplerRefactored       1.65e-11     5.45e-11    5.29e-11    2.08e-11
jetengineRefactored     3.64e-8      4.67e-4     1.41e-4     3.36e-7
turbine1Refactored      4.09e-10     1.82e-9     1.88e-9     4.60e-10
turbine2Refactored      5.10e-10     2.82e-9     2.90e-9     5.86e-10
turbine3Refactored      2.10e-10     1.24e-9     1.27e-9     3.32e-10

Table 6.3 – Comparison of computed absolute errors from Fluctuat and our two error computations in Rosa on straight-line, nonlinear benchmarks. Simulations were performed with 10^7 random inputs. We mark the best result per benchmark in bold.

more accurately with our technique and thus improve the overall bounds even further. The experimental results confirm the benefit of this step-wise error computation. Rosa+ currently performs the error propagation only for expressions defined as vals or final expressions, as too fine-grained steps would increase the running time unnecessarily or degrade the computed results.


Overall, we remark that many of the soundly computed errors actually come quite close to the true errors, as indicated by the simulation results. We thus believe that our technique and its accuracy are well suited for the analysis of numerical computation kernels.

Sensitivity The propagation coefficients provide information about the sensitivity of a function to input errors. For example, for the non-refactored doppler benchmark, the computed coefficients for u, v, T are

3.216238, 0.006881929 and 0.7557531, respectively. Thus, absolute input errors on u get magnified by approximately 3.22 in the worst case, whereas input errors on v have a much more favorable factor of below one. This information can, for instance, be used to direct optimization efforts, and we report the computed propagation coefficients in comments in the generated code.
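As a rough illustration with hypothetical input errors: if each of u, v and T carried an initial error of 1e-11, the propagated contributions would add up to approximately
$$3.22 \cdot 10^{-11} + 0.0069 \cdot 10^{-11} + 0.76 \cdot 10^{-11} \;\approx\; 4.0 \cdot 10^{-11},$$
so reducing the error on u has by far the largest effect.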

Running times

Table 6.4 compares the running times of Rosa and Rosa+ for selected benchmarks. Fluctuat generally computed its result within one second, since it does not use an SMT solver internally. We have not specifically optimized our implementation, and improvements are certainly possible. Nevertheless, while our two-step computation clearly increases the runtime, we believe that the times remain acceptable for a static verification approach.

First-order Method Summaries  Section 6.2.4 introduced a possible extension of the propagation coefficients to postconditions where the errors are functions of both the initial errors and the ranges of the corresponding variables. Here we give a possible scenario of how these ‘Taylor summaries’ can be used. Recall that Rosa’s verification framework is modular in that each method is verified separately, and method postconditions are used, where possible, at call sites. The specifications have to be general, however, to allow a method to be used in many instances, yet accurate enough to facilitate a successful verification.

For example, consider the following seventh order approximation to the sine function, as it may be used in an embedded system, where trigonometric functions are often approximated.

def sine(x: Real): Real = {
  require(-3.5 < x && x < 3.5 && x +/- 1e-8)
  x - (x*x*x)/6.0 + (x*x*x*x*x)/120.0 - (x*x*x*x*x*x*x)/5040.0
} ensuring(res => -1.0 < ~res && ~res < 1.0 && res +/- 2e-7)


benchmark              Rosa    Rosa+
doppler                   5       23
dopplerRefactored         5       19
jetEngine               120      454
jetEngineRefactored     118      400
rigidBody1              0.1      0.2
rigidBody2                6        8
rigidBody2Refactored      6        7
sine                      3        4
sineOrder3              0.1      < 1
sqroot                    1        2
turbine1                  1       18
turbine1Refactored        1        2
turbine2                  1        6
turbine2Refactored        1        2
turbine3                  1       19
turbine3Refactored        1        5
verhulst                  4       10
predatorPrey              1       24
carbonGas                 4       26

Table 6.4 – Running times of our approach (Rosa+) compared with Rosa (in seconds)

The postcondition is successfully verified for the given range and input error. But what if, at a call site, the range or the initial error is smaller? Consider two calls to sine

require(-0.5 <= y && y <= 0.5 && y +/- 1e-8)
...
sine(y)

require(-3.0 <= z && z <= 1)
...
sine(z)

With Rosa, one can either use the postcondition with the given error on the result of 2e-7, or inline the function and essentially re-do the error computation. In contrast, our approach described in subsection 6.2.4 will instead use the computed summaries and determine the error for the first case to be 1.000e-8 and for the second case 4.945e-15, improving the error bounds by more than 6 decimal orders of magnitude. This illustrates the benefits of the relational summaries that our approach computes.
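As a hedged sketch of the shape of such a summary (the notation follows the previous sections; the exact form Rosa computes may differ in detail), the postcondition bounds the result error as a function of the call-site range and initial error instead of as a constant:
$$|\mathrm{res} - \widetilde{\mathrm{res}}| \;\le\; K(X)\,\lambda_x + \sigma(X),$$
where $X$ is the range of the argument at the call site, $\lambda_x$ its initial error and $\sigma(X)$ the round-off committed by the body over that range. At the first call site the propagated term dominates, which is why the resulting error stays close to the initial 1e-8; at the second call site $\lambda_z = 0$, so only the round-off term remains.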


Conclusion

In this chapter we have presented techniques for an accurate static analysis of numerical errors and ranges. They crucially rely on the use of a nonlinear SMT solver to incorporate correlations between variables from the computation and from additional (external) constraints in the computation of ranges. Furthermore, we show that a separation-of-errors strategy opens up possibilities for more specialized techniques that can compute tighter error estimates. Effectively, our error propagation computation can employ analytical information about the real-valued mathematical function and improve the error bounds substantially. We believe that our experimental results show that this effort is well spent.

7 Handling Control Structures

Until now, we have considered numerical error estimation for straight-line code without branches and loops, which we cover in this chapter. The material in this chapter is based on [47] and subsequent extensions [46].

7.1 Discontinuity Errors

By discontinuity error we mean the difference between the ideal and actual computation due to uncertainties on computing branch conditions and the resulting different paths taken. So unlike the runtime solution presented in subsection 3.1.1, which merely detects that control flow may diverge, here we are interested in a more rigorous treatment in the sense that we want to quantify the resulting difference between the real-valued and the finite-precision computation. Quantifying discontinuity errors is hard due to many correlations between variables of the two branches, but also due to nonlinearity.

Embedded systems often use piece-wise approximations of more complex functions. In Figure 7.1 we show a possible piece-wise polynomial approximation of the jet engine controller from Figure 5.5. We obtained this approximation by fitting a polynomial to a sample of values of the original function. For many applications it is crucial that the error introduced by the discontinuity remains small, since we may otherwise risk unstable behavior. The real-valued difference between the two branches is at most 0.21021. However, this is not a sound estimate for the discontinuity error in the presence of round-off and initial errors. With our tool, we can confirm that the discontinuity error is bounded by 0.21202, with all errors taken into account.

We note that the direct encoding of the computation constructed according to Section 5.2.1 automatically includes discontinuity errors. Recall that we encode the ideal and actual computations such that they are independent except for the initial conditions. Because of this independence it is possible that they follow different paths through the program. When we cannot encode the actual computation and approximate it instead, we compute the error on individual paths and have to consider the error due to diverging paths separately.


def jetApproxGoodFit(x: Real, y: Real): Real = {
  require(-5 <= x && x <= 5 && -5 <= y && y <= 5 &&
          x +/- 1e-11 && y +/- 1e-11)

  if (y < x)
    -0.367351 + 0.0947427*x + 0.0917402*x*x - 0.00298772*y +
      0.0425403*x*y + 0.00204213*y*y
  else
    -0.308522 + 0.0796111*x + 0.162905*x*x + 0.00469104*y -
      0.0199035*x*y - 0.00204213*y*y
}

Figure 7.1 – Piece-wise approximation of the jet engine controller

We propose two algorithms to explicitly compute the difference between the ideal and the actual computation across paths. Neither method assumes continuity or any other specific property of the function, i.e. the algorithms allow us to compute error bounds even in the case of non-smooth or non-continuous functions. For simplicity, we present the algorithms here for the case of one conditional statement: if (c(x) < 0) f1(x) else f2(x). They generalize readily to more complex expressions by rewriting the branches into this form.

Using our previous notation, let $f_1$ and $f_2$ be the real-valued functions corresponding to the if and the else branch, respectively. Then the discontinuity error is given by
$$|f_1(x) - \tilde{f}_2(\tilde{x})|$$
That is, the real computation takes branch $f_1$, and the finite-precision one $f_2$. The opposite case is analogous:

def getDiscontinuityError:
  Input: pre: (x ∈ [a, b] ∧ x ± n)
         program: (if (cond(x) < 0) f1(x) else f2(x))

  val discError1 = computeDiscError(pre, cond, f1, f2)
  val discError2 = computeDiscError(pre, ¬cond, f2, f1)
  return max(discError1, discError2)

7.1.1 Precise Constraint Solving

Formulating it as a separation of errors, our technique computes the following:

$$|f_1(x) - \tilde{f}_2(\tilde{x})| \;\le\; |f_1(x) - f_2(\tilde{x})| + |f_2(\tilde{x}) - \tilde{f}_2(\tilde{x})|$$

In order to obtain a useful result, our first technique constructs an accurate constraint relating the variables from f1 to the variables in f2 while computing the first difference. The second part is then simply the round-off error from evaluating f2 in finite precision.


1  def computeDiscError(pre, c, f1, f2):
2    ([c], errc) = evalWithError(pre, c)
3    varConstraint = c(x1) ∈ [-errc, 0] ∧
4                    c(x2) ∈ [0, errc] ∧
5                    relate(x1, x2, errc)
6    [diff] = getRange(pre ∧ varConstraint, f1(x1) - f2(x2))
7    ([f2]float, errfloat) = evalWithError(pre ∧ c(x2) ∈ [0, errc], f2)
8    return max(|[diff]|) + errfloat

Figure 7.2 – Computation of errors due to diverging paths. Quantities denoted by [x] are intervals.

W.l.o.g. we assume that the condition is of the form c(x) < 0. Indeed, any conditional of the form c(x) == 0 would yield different results for the ideal and actual computation for nearly any input, so we do not allow it in our specification language.

The actual finite-precision computation commits a certain error when computing the condition of the branch, and it is this error that causes some executions to follow a different branch than the corresponding ideal one would take. Figure 7.2 presents our first algorithm to compute the discontinuity error. The idea is to consider the computation along f1 and f2 as separate computations with independent variables $x_1$ and $x_2$, which are related such that, for the computation of the difference $f_1 - f_2$, they are at most $\mathrm{err}_c$ apart. Additionally, $x_1$ and $x_2$ are restricted such that $c(x_{1,2}) \in [-\mathrm{err}_c, \mathrm{err}_c]$, that is, we compute the difference only for the values for which the computation could be diverging. We compute the difference as an interval, since the inputs $x_1$ and $x_2$ are also interval valued. The overall error is then the real-valued difference plus the round-off error committed by $f_2$. The algorithm extends naturally to several variables.
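For instance, for the benchmark in Figure 7.1 the branch condition is y < x, i.e. $c(x, y) = y - x$. If $\mathrm{err}_c$ bounds the finite-precision error of evaluating y - x, the algorithm restricts the two copies of the inputs to $c(x_1, y_1) \in [-\mathrm{err}_c, 0]$ and $c(x_2, y_2) \in [0, \mathrm{err}_c]$: only inputs this close to the branch boundary can make the real-valued and the finite-precision execution take different branches, and the difference of the two polynomials is evaluated over exactly this set.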

In the case of several paths through the program, this error has to be, in principle, computed for each pair of paths. We use Z3 to rule out infeasible paths up front, so that the path error computation is only performed for those paths that are actually feasible. Rosa implements this algorithm while merging paths at every branch. It also uses a higher default accuracy and a higher iteration threshold during the binary search in the range computation, as this computation in general requires very tight intervals because the errors can be small.

We identify two challenges for performing this computation:

1. As soon as the program has multiple variables, the inputs for the different branches are not two-dimensional intervals anymore, which makes an accurate evaluation of the difference difficult.


2. The inputs for the two branches are inter-dependent. Simply evaluating the two branches with inputs that are in the correct ranges, but are not correlated, yields pessimistic results when computing the difference (line 6 in Figure 7.2).

We overcome the first challenge with our range computation, which takes into account additional constraints. For the second challenge, we use our range computation as well. Unfortunately, Z3 fails to tighten the final range to a satisfactory accuracy due to timeouts for more complex examples, especially with several variables. We still obtain much better error estimates than with interval arithmetic alone, as the ranges of values for the individual paths are already computed much more accurately.

Fluctuat also includes a procedure to evaluate discontinuity errors and takes a similar route. It does not use an SMT solver, but it constrains the noise terms of the real and floating-point computation in its abstract domain based on a logical product with the interval domain [73]. We will show later in the experimental results that this approach does not yield satisfactory results either. Furthermore, the underlying domain is linear, hence nonlinearity in f1 and f2 causes further over-approximations.

7.1.2 Trading Accuracy for Scalability

While our first approach is accurate for unary functions that are not too complex, the accurate constraint which correlates x1 and x2 quickly becomes too complex in general. We trade some of the accuracy that this constraint provides for scalability by taking the separation of errors idea further. We now separate the error into three parts:

$$|f_1(x) - \tilde{f}_2(\tilde{x})| \;\le\; |f_1(x) - f_1(\tilde{x})| + |f_1(\tilde{x}) - f_2(\tilde{x})| + |f_2(\tilde{x}) - \tilde{f}_2(\tilde{x})| \qquad (7.1)$$
Figure 7.3 illustrates this separation. The individual components are

i) $|f_1(x) - f_1(\tilde{x})|$: the difference in $f_1$ due to initial errors. We can compute this difference with our previously defined propagation coefficients: $|f_1(x) - f_1(\tilde{x})| \le K |x - \tilde{x}|$.

ii) $|f_1(\tilde{x}) - f_2(\tilde{x})|$: the real-valued difference between $f_1$ and $f_2$. We can bound this value by our range computation.

iii) $|f_2(\tilde{x}) - \tilde{f}_2(\tilde{x})|$: the round-off error when evaluating $f_2$ in finite-precision arithmetic. We use the same procedure as in chapter 6.

We expect the individual parts to be easier to handle for the underlying SMT solver, since we reduce the number of variables and correlations. Fluctuat and our first technique compute the discontinuity error as one difference between the computations on the two paths of a branch. In contrast, here we split the error and compute its parts separately, obtaining a more scalable procedure.



Figure 7.3 – Illustration of error computation for conditional branches

On the other hand, we clearly introduce an additional over-approximation, but we will show in our experiments that it is in general small, even for benchmarks where the accurate approach from the previous section performs well. For more complex benchmarks, our tool outperforms the more accurate approach by far.

We perform our analysis pairwise for each pair of paths in the program. While this gives, in the worst case, an exponential number of cases to consider, we found that many of these cases are infeasible due to inconsistent branch conditions and can be eliminated early. Note also that, because we treat loops analytically as opposed to by unrolling, we do not encounter many nested conditionals in practice. Our tool accepts any branch condition of the form $e_1 \circ e_2$, where $e_1$ and $e_2$ are arithmetic expressions and $\circ \in \{<, \le, >, \ge\}$.

Determining ‘Critical Inputs’

As in the previous section, it is crucial to determine the ranges of $x$ and $\tilde{x}$ over which to evaluate the individual parts of Equation 7.1. Recall that for the analysis, $x$ and $\tilde{x}$ are real-valued. A sound approach would be to use the same bounds as for the straight-line case, but this would lead to unnecessary over-approximations. In general, not all inputs can exhibit a divergence between the real-valued and the finite-precision computation. We call those which possibly do the ‘critical inputs’. They are determined by the branch conditions and the errors on the variables. Consider the branch condition if (e1 < e2) and the case where the real-valued path takes the if-branch, i.e. variable $x$ satisfies $e_1 < e_2$ and $\tilde{x}$ satisfies $e_1 \ge e_2$. The constraint for the finite-precision variables $\tilde{x}$ is then

$$e_1 + \delta_1 < e_2 + \delta_2 \;\wedge\; e_1 \ge e_2$$
where $\delta_1, \delta_2$ are error intervals capturing the numerical error on evaluating $e_1$ and $e_2$, respectively. This constraint expresses that we want those values which satisfy the condition $e_1 \ge e_2$, but are “close enough” to the boundary such that their corresponding ideal real value could take the other path. In practice, we will merge the two error variables and only add one to the constraint: $\delta = \delta_2 - \delta_1$. The procedure for other branch conditions is analogous. We create such a constraint both for the variables representing finite-precision values ($\tilde{x}$) as well as for the real-valued ones ($x$), and use them as additional constraints when computing the individual parts of Equation 7.1.

The constraints just described essentially express the correlations between $x$ and $\tilde{x}$, but they do not necessarily narrow down their overall ranges. These are, however, important for the computation of the round-off errors (third part of Equation 7.1), since these depend directly on the ranges. If the branch condition is a “range constraint”, that is, if it is of the form x < c, where c is a constant, then we use these constraints to tighten the variable ranges. If, however, the branch condition is a relative condition such as x < y, then the ranges of x and y remain unconstrained, i.e. x and y individually can still take all the values in their input ranges. In this case, we have to accept the resulting over-approximation, but we found that in general the contribution of the round-off error term is relatively small anyway.
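As a small worked instance: for a range constraint x < 1.0 with a merged error interval $\delta$ on evaluating the condition, the critical finite-precision values that take the else branch while their ideal counterparts could take the if branch satisfy $1.0 \le \tilde{x} \,\wedge\, \tilde{x} < 1.0 + \delta$, so the range of $\tilde{x}$ used for the round-off computation on that path tightens to $[1.0,\, 1.0 + \delta)$; the constraint for the real-valued $x$ is analogous.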

7.1.3 Experimental Results

We compare the results of our techniques for computing discontinuity errors against those implemented in Fluctuat in Table 7.1. Rosa 1st and 2nd denote the first and the second technique, respectively. The first part of the table contains unary functions. simpleInterpolator and squareRoot are taken from [73], and we added the squareRoot3 and cubic spline benchmarks, which we show in Figure 7.4. All of these examples are representative of code one may find in numerical programs such as embedded systems. We observe that our first technique can indeed leverage its more accurate constraint and compute tighter bounds.

The second half of the table presents results for benchmarks in two variables. We have derived these benchmarks by piece-wise approximating a complex function, a common pattern seen in embedded systems. Figure 7.4 shows two representative benchmarks. (0.1) indicates that inputs have an added initial error of 0.1; otherwise we assume round-off errors only. (X) indicates that the benchmark’s branch condition is relational (e.g. x < y). We can see that, except for the benchmark quadratic fit2, our new technique outperforms the existing ones significantly, sometimes by several orders of magnitude.

For benchmarks marked with ∗, we can confirm that the results determined by our tool closely match the discontinuity present in the real-valued function. Our tool, of course, also considers round-off errors, but for benchmarks with small input errors the real-valued discontinuity error dominates (and often arises from the piece-wise approximation approach used to construct the benchmark). The results thus show that our three-way separation of errors lets us decouple the mathematical problem from the finite-precision implementation and treat each part appropriately.


def cubicSpline(x: Real): Real = {
  require(-2 <= x && x <= 2)
  if (x <= -1) 0.25 * (x + 2) * (x + 2) * (x + 2)
  else if (x <= 0) 0.25 * (-3*x*x*x - 6*x*x + 4)
  else if (x <= 1) 0.25 * (3*x*x*x - 6*x*x + 4)
  else 0.25 * (2 - x) * (2 - x) * (2 - x)
}

def squareRoot3Invalid(x: Real): Real = {
  require(0 < x && x < 10 && x +/- 1e-10)
  if (x < 1e-4) 1 + 0.5 * x
  else sqrt(1 + x)
}

def quadraticFitWithError(x: Real, y: Real): Real = {
  require(-4 <= x && x <= 4 && -4 <= y && y <= 4 && x +/- 0.1 && y +/- 0.1)
  if (x <= 0)
    if (y <= 0) {
      -0.0495178 - 0.188656*x - 0.0502969*x*x - 0.188656*y +
        0.0384002*x*y - 0.0502969*y*y
    } else {
      0.0495178 + 0.188656*x + 0.0502969*x*x - 0.188656*y +
        0.0384002*x*y + 0.0502969*y*y
    }
  else
    if (y <= 0) {
      0.0495178 - 0.188656*x + 0.0502969*x*x + 0.188656*y +
        0.0384002*x*y + 0.0502969*y*y
    } else {
      -0.0495178 + 0.188656*x - 0.0502969*x*x + 0.188656*y +
        0.0384002*x*y - 0.0502969*y*y
    }
}

def styblinski2(x: Real, y: Real): Real = {
  require(-5 <= x && x <= 5 && -5 <= y && y <= 5)
  if (y < x)
    -2.60357 + 0.25*x + 0.369643*x*x + 0.889286*y -
      4.32143e-16*x*y - 0.896429*y*y
  else
    -3.76071 + 0.25*x + 0.369643*x*x + 0.889286*y +
      5.08864e-16*x*y - 0.703571*y*y
}

Figure 7.4 – Representative benchmarks


benchmark                  Fluctuat   Rosa 1st    Rosa 2nd
simpleInterpolator         3.45e-5    2.346e-5    3.401e-5
squareRoot                 0.0394     0.02365     0.02382
squareRoot3                0.429      1.308e-9    1.313e-9
cubic spline               12.01      1.499e-15   1.499e-15
linear fit                 1.721      0.6374      0.6374*
quadratic fit              10.61      3.218       0.2548*
quadratic fit (0.1)        11.25      3.226       0.2904
quadratic fit2 (X)         0.6322     9.195e-16   1.094e-15
quadratic fit2 (X, 0.1)    0.7781     0.05583     0.08554
styblinski                 223.5      70.41       0.6320*
styblinski (0.1)           239.91     71.01       3.250
styblinski2 (X)            30.495     18.69       3.665*
styblinski2 (X, 0.1)       33.19      19.36       4.863
jetApprox                  25.4       10.91       0.1702*
jetApprox - good fit (X)   5.73       4.255       0.2121*
jetApprox - bad fit (X)    15.32      3.822       1.358*

Table 7.1 – Comparison of computed absolute errors with our new Lipschitz constant-based approach against the state of the art on benchmarks with discontinuities.

Table 7.2 shows that the runtimes of our two techniques are comparable. Fluctuat returns very fast, so we do not list it here. Its results are, however, significantly worse than Rosa’s, and we believe that the performance trade-off of using an SMT solver is well justified. The runtimes also show that we can gain accuracy by performing a three-way separation of errors without notably losing performance over the two-way approach, which makes the second technique clearly preferable on more complex examples.


benchmark                    Rosa   Our tool
simpleInterpolator (float)      1       1
squareRoot (F)                  3      22
squareRoot3                     6       5
squareRoot3 invalid             7       7
cubic spline                   14      17
natural spline                 18      18
linear fit                      3       4
quadratic fit                  45      54
quadratic fit (0.1)            66      70
quadratic fit2                 14      54
quadratic fit2 (0.1)           19      20
styblinski                    118      52
styblinski (0.1)              220      74
sortOfStyblinski               53      33
sortOfStyblinski (0.1)         73      25
jetApprox                     126     139
jetApprox (0.1)               161     236
jetApprox - good fit           46      33
jetApprox - good fit (0.1)     38      26
jetApprox - bad fit           108     219
jetApprox - bad fit (0.1)     105     157

Table 7.2 – Comparison of runtimes on discontinuity benchmarks. Running times are in seconds.


7.2 Loops

For loops where errors grow without a constant absolute bound, current tools are forced to unroll the loops or apply widening, often returning a trivial upper bound of ∞. Even if the loop is bounded, unrolling often scales poorly. We propose to compute the numerical errors as a function of the number of iterations. This allows us to derive a closed-form expression for the loop’s error which constitutes an inductive loop invariant and also characterizes the loop’s behavior. This expression can also be used to compute concrete error bounds for any given number of loop iterations, often returning better results than unrolling.

In order to derive the closed-form expression, we again apply the idea of propagation of errors. We want to compute the overall error after the $m$-fold iteration $f^m$ of $f$. We define, for any function $H$: $H^0(x) = x$ and $H^{m+1}(x) = H(H^m(x))$. We are thus interested in bounding
$$|f^m(x) - \tilde{f}^m(\tilde{x})|$$
$f$, $g$ and $\sigma$ are now vector-valued, $f, g, \sigma : \mathbb{R}^n \to \mathbb{R}^n$, because we are nesting the potentially multivariate function $f$.

Theorem 7.1 Let $g$ be such that $|f(x) - f(y)| \le g(|x - y|)$; it satisfies $g(x + y) \le g(x) + g(y)$ and is monotonic. Further, $\sigma$ and $\lambda$ satisfy $\sigma(\tilde{x}) = |f(\tilde{x}) - \tilde{f}(\tilde{x})|$ and $|x - \tilde{x}| \le \lambda$. The absolute value is taken component-wise. Then the numerical error after $m$ iterations is given by
$$|f^m(x) - \tilde{f}^m(\tilde{x})| \;\le\; g^m(|x - \tilde{x}|) + \sum_{i=0}^{m-1} g^i\big(\sigma(\tilde{f}^{\,m-i-1}(\tilde{x}))\big) \qquad (7.2)$$

Proof: We show this by induction. The base case $m = 1$ has already been covered in subsection 6.2.1. By adding and subtracting $f(\tilde{f}^{m-1}(\tilde{x}))$ component-wise we get
$$
\begin{pmatrix} |f^m(x)_1 - \tilde{f}^m(\tilde{x})_1| \\ \vdots \\ |f^m(x)_n - \tilde{f}^m(\tilde{x})_n| \end{pmatrix}
\;\le\;
\begin{pmatrix} |f^m(x)_1 - f(\tilde{f}^{m-1}(\tilde{x}))_1| \\ \vdots \\ |f^m(x)_n - f(\tilde{f}^{m-1}(\tilde{x}))_n| \end{pmatrix}
+
\begin{pmatrix} |f(\tilde{f}^{m-1}(\tilde{x}))_1 - \tilde{f}^m(\tilde{x})_1| \\ \vdots \\ |f(\tilde{f}^{m-1}(\tilde{x}))_n - \tilde{f}^m(\tilde{x})_n| \end{pmatrix}
$$
Applying the definitions of $g$ and $\sigma$,
$$
\le\; g\!\begin{pmatrix} |f^{m-1}(x)_1 - \tilde{f}^{m-1}(\tilde{x})_1| \\ \vdots \\ |f^{m-1}(x)_n - \tilde{f}^{m-1}(\tilde{x})_n| \end{pmatrix} + \sigma(\tilde{f}^{m-1}(\tilde{x}))
$$

then, using the induction hypothesis and monotonicity of $g$,
$$\le\; g\!\left( g^{m-1}(\vec{\lambda}) + \sum_{i=0}^{m-2} g^i\big(\sigma(\tilde{f}^{\,m-i-2}(\tilde{x}))\big) \right) + \sigma(\tilde{f}^{\,m-1}(\tilde{x}))$$
then, using $g(x + y) \le g(x) + g(y)$, we finally have
$$\le\; g^m(\vec{\lambda}) + \sum_{i=1}^{m-1} g^i\big(\sigma(\tilde{f}^{\,m-i-1}(\tilde{x}))\big) + \sigma(\tilde{f}^{\,m-1}(\tilde{x}))
\;=\; g^m(\vec{\lambda}) + \sum_{i=0}^{m-1} g^i\big(\sigma(\tilde{f}^{\,m-i-1}(\tilde{x}))\big) \qquad \square$$

In words, the overall error after $m$ iterations can be decomposed into the initial error propagated through $m$ iterations, and the round-off error from the $i$-th iteration propagated through the remaining iterations.
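As a direct instantiation, for $m = 2$ Equation 7.2 reads
$$|f^2(x) - \tilde{f}^2(\tilde{x})| \;\le\; g^2(|x - \tilde{x}|) + \sigma(\tilde{f}(\tilde{x})) + g(\sigma(\tilde{x})),$$
i.e. the initial error propagated through both iterations, plus the round-off error of the second iteration, plus the round-off error of the first iteration propagated through the remaining one.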

7.2.1 Closed Form Expression

We instantiate the propagation function $g$ as before and would like to derive a closed-form expression for the error, as Equation 7.2 is not very evaluation friendly. In fact, evaluating Equation 7.2 as given, with a fresh set of propagation coefficients for each iteration $i$, amounts to loop unrolling, but with a loss of correlation between the loop iterations.

Suppose we can compute $K$ as a matrix of propagation coefficients, and similarly obtain $\sigma(\tilde{f}^i) = \sigma$ as a vector of constants, both valid over all iterations. Then we obtain a closed form for the expression of the error:
$$|f^m(x) - \tilde{f}^m(\tilde{x})| \;\le\; K^m\lambda + \sum_{i=1}^{m-1} K^i\sigma + \sigma \;=\; K^m\lambda + \sum_{i=0}^{m-1} K^i\sigma$$
where $\lambda$ is the vector of initial errors. If $(I - K)^{-1}$ exists,
$$|f^m(x) - \tilde{f}^m(\tilde{x})| \;\le\; K^m\lambda + \big((I - K)^{-1}(I - K^m)\big)\sigma$$
We obtain $K^m$ with power-by-squaring and compute the inverse with the Gauss-Jordan method with rational coefficients to obtain sound results. When $K = I$, $g$ becomes the identity function and so
$$|f^m(x) - \tilde{f}^m(\tilde{x})| \;\le\; \lambda + \sum_{i=1}^{m-1}\sigma + \sigma \;=\; \lambda + m \cdot \sigma$$
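The following is a minimal sketch (not Rosa's implementation) of how the scalar version of this closed form could be evaluated; K, sigma and lambda stand for the propagation coefficient, the per-iteration round-off bound and the initial error, and a sound implementation would additionally need rational arithmetic with outward rounding, which this sketch omits.

// Scalar closed-form loop error bound: K^m * lambda + ((1 - K^m)/(1 - K)) * sigma,
// assuming K != 1. Sketch only: uses BigDecimal instead of the rationals with
// directed rounding that a sound implementation would require.
def loopErrorBound(K: BigDecimal, sigma: BigDecimal, lambda: BigDecimal, m: Int): BigDecimal = {
  val Km = K.pow(m)
  Km * lambda + ((BigDecimal(1) - Km) / (BigDecimal(1) - K)) * sigma
}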


Now the question remains how to determine $K$ and $\sigma$. As before, this boils down to determining the ranges of the variables $x, \tilde{x}$ over which to compute the coefficients $K_{ij} = \sup_{x,\tilde{x}}\left(\frac{\partial f_i}{\partial x_j}\right)$ and which to use for the round-off error computation. We consider two cases.

Inductive Constant Ranges  When the ranges of the variables of the loop are inductive, that is, both the real-valued and the finite-precision values remain within the initial ranges, then these are clearly the ranges for the computation of $K$ and the round-offs $\sigma$. We require the user to specify both the real-valued ranges of variables (e.g. a <= x && x <= b) as well as the actual finite-precision ones (c <= ~x && ~x <= d). We also require that the actual ranges always include the real ones ($[a,b] \subseteq [c,d]$), hence it is the actual ranges ($[c,d]$) that are used for the computation of $K$ and $\sigma$.

Iteration-dependent Ranges  For many loops, however, inductive constant ranges either do not exist or are very hard to prove. Or it may be that only the real-valued ranges are inductive, but, because of round-off errors, the actual ones are not. We still want to analyze these loops, but it is clear that the validity of the computed $K$, $\sigma$ and final errors is limited to the validity of the ranges which we use for their computation. Our tool attempts to verify that the range bounds are valid, i.e. inductive. Should it not succeed, it will nonetheless perform the error computation with the user-given actual ranges. The generated code will, however, include the precondition require(c <= x && x <= d), which in Scala is checked at runtime. We believe that this is a reasonable compromise, as these assertions are fast to check and in many applications ranges do stay bounded.

7.2.2 Truncation Errors

What if round-off errors are not the only errors? If the real-valued computation given by the specification is also the ideal computation, we can simply add the errors in the same way as round-off errors. If the real-valued computation is, however, already an approximation of some other unknown ideal function, say $f_*$, it is not directly clear how our error computation applies. This may be the case, for example, for truncation errors. Let us suppose that we can compute (or at least overestimate) these by a function $\tau : \mathbb{R}^n \to \mathbb{R}^n$, i.e. $\tau_f(x) = |f_*(x) - f(x)|$. In the following we consider the one-dimensional case $n = 1$ for simplicity of exposition, but it generalizes as before to the $n$-dimensional case. We can apply a similar separation of errors as before:
$$|f_*(x) - \tilde{f}(\tilde{x})| \;\le\; |f_*(x) - f(x)| + |f(x) - f(\tilde{x})| + |f(\tilde{x}) - \tilde{f}(\tilde{x})| \;=\; \tau(x) + g(|x - \tilde{x}|) + \sigma(\tilde{x})$$
which lets us decompose the overall error into the truncation error, the propagated initial error and the round-off error. If we now iterate, we find by a similar argument as before:
$$
\begin{aligned}
|f_*^m(x) - \tilde{f}^m(\tilde{x})| \;&\le\; g^m(|x - \tilde{x}|) + \sum_{j=0}^{m-1} g^j\big(\tau(f_*^{\,m-j-1}(x))\big) + g^j\big(\sigma(\tilde{f}^{\,m-j-1}(\tilde{x}))\big) \\
&=\; g^m(|x - \tilde{x}|) + \sum_{j=0}^{m-1} g^j\big(\tau(f_*^{\,m-j-1}(x)) + \sigma(\tilde{f}^{\,m-j-1}(\tilde{x}))\big)
\end{aligned}
$$

The result essentially means that our previously defined method can also be applied to the case when truncation (or similar) errors are present. We do not pursue this direction further however, and leave a proper automated treatment of truncation errors to future work.

7.2.3 Experimental Results

We have implemented our proposed technique inside Rosa and evaluate it for the case of loops on a number of examples which demonstrate several features of our system.

Newton-Raphson Method  We begin with an example of a loop in which we wish to show that the value of a variable always remains in a certain range. Such a property is important, for instance, in the case of iterative algorithms and controllers, where unboundedness suggests that the system diverges. The following function, taken from [56], implements a Newton-Raphson approximation. (We abbreviate e.g. x*x*x as x³ for readability.)

def newton(x: Real, k: LoopCounter): Real = {
  require(-1.0 < x && x < 1.0 && -1.0 < ~x && ~x < 1.0)
  if (k < 10)
    newton(x - (x - x³/6.0 + x⁵/120.0 + x⁷/5040.0) /
               (1 - (x*x)/2.0 + x⁴/24.0 + x⁵/720.0), k + 1)
  else x
} ensuring(res => -1.0 < res && res < 1.0 && -1.0 < ~res && ~res < 1.0)

Note that the precondition is also the loop invariant we wish to check. Our tool can automatically verify that this specification is inductive, also in the presence of round-off errors. Fluctuat, relying on affine arithmetic, cannot prove that even one iteration remains in the given bound, and applying it to an unbounded loop produces ∞ as the error bound. Our result holds for any number of iterations and does not rely on unrolling. If we were to change the range specification to -1.2 < x && x < 1.2 && -1.2 < ~x && ~x < 1.2, the verification (correctly) fails and our tool reports a counterexample. Thus, our system can be used to establish boundedness of values in loops even if the number of iterations is unknown at compile time.


@model
def nextItem: Real = {
  ????
} ensuring (res => -1200 <= res && res <= 1200 && res +/- 1e-8)

def mean(n: Int, m: Real): Real = {
  require(-1200 <= m && m <= 1200 && 2 <= n && n <= 1002 &&
          -1200.5 <= ~m && ~m <= 1200.5)
  if (n < 102) {
    val x = nextItem
    val m_new = ((n - 1.0) * m + x) / n
    mean(n + 1, m_new)
  } else m
} ensuring (res => -1200.00 <= res && res <= 1200.00)

Figure 7.5 – Running average computation

Running Average  Now we return to numerical error estimation. Figure 7.5 shows the implementation of an online computation of the average of numbers coming from the range [-1200, 1200]. The method nextItem models a fresh value from an undetermined source. Round-off errors allow the computed value to go outside of [-1200, 1200]; we thus use the actual range [-1200.5, 1200.5] for the error computation. We chose this range optimistically to cover a substantial number of iterations.

Table 7.3 compares the errors and runtimes computed by Rosa against the errors determined by Fluctuat. We compare with Fluctuat using complete unrolling of loops, because we have found that abstract interpretation with the domains in Fluctuat does not stabilize on these kinds of loops due to the unboundedly growing errors. In contrast, our system can discover parametric bounds that grow as a function of the loop iteration. In particular, the overall error is given as
$$K^m\lambda + \big((I - K)^{-1}(I - K^m)\big)\sigma$$
where Rosa determines, for example for the case of no added initial errors, $K = 0.999001996$, $\sigma = 1.5082\mathrm{e}{-10}$ and $\lambda = 1.1369\mathrm{e}{-13}$. If we try to unroll the loop within Rosa, it does not finish in reasonable time even for small numbers of iterations, hence we do not list it here.

In this example, the computation depends on the loop counter n, making the constraints different for different numbers of iterations and thus different bounds on n. This may make the constraint more or less hard to solve for Z3, which results in the, perhaps surprising, variation in running times.

The comparison in Table 7.3 shows the trade-off our technique makes between accuracy and scalability. Recall that our proposed technique makes two approximations: firstly, it approximates the effect of each loop iteration in isolation, potentially losing correlation information, and secondly, the propagation coefficients are computed across the whole input range.


# iterations   Fluctuat    time   Rosa        time

without additional error
100            1.522e-11   0.5    7.486e-10   20
500            7.742e-11   9      2.393e-8    28
1000           1.545e-10   35     9.544e-8    28
2000           3.085e-10   180    1.306e-7    19
3000           4.554e-10   425    1.436e-7    19
4000           6.166e-10   814    1.484e-7    19

with 1e-8 error
100            9.916e-9    0.5    3.203e-7    20
500            1.006e-8    9      1.608e-6    28
1000           1.014e-8    39     3.260e-6    28
2000           1.030e-8    176    4.460e-6    19
3000           1.045e-8    423    4.903e-6    19
4000           1.062e-8    813    5.066e-6    19

Table 7.3 – Comparison of absolute errors and running times computed by Fluctuat and Rosa on the mean benchmark. Loop unrolling with Rosa times out. Time is measured in seconds.

While Fluctuat, performing loop unrolling and keeping correlations, computes smaller error bounds, our tool is significantly faster and more scalable for larger numbers of iterations (> 1000).

Pendulum  Figure 7.6 shows a Runge-Kutta order 2 simulation of a pendulum, where t and w are the angle the pendulum forms with the vertical and the angular velocity, respectively. We approximate the sine function with its Taylor series polynomial, and consider two versions: the order 3 and the order 5 Taylor approximation. In both cases we focus on round-off errors between the system following the dynamics given by the polynomial approximation, and the system following the same dynamics but implemented in finite precision.

The precondition now specifies two sets of ranges. -2 <= t constrains the real-valued variable t, whereas -2.01 <= ~t specifies that its actually implemented finite-precision counterpart remains in a slightly larger range. This latter interval includes all errors including round-offs and is therefore larger.


def sine(x: Real): Real = {
  require(-5 <= x && x <= 5)
  x - x*x*x/6 //+ x*x*x*x*x/120
}

def pendulum(t: Real, w: Real, n: LoopCounter): (Real, Real) = {
  require(-2 <= t && t <= 2 && -5 <= w && w <= 5 &&
          -2.01 <= ~t && ~t <= 2.01 && -5.01 <= ~w && ~w <= 5.01)

  if (n < 1000) {
    val h: Real = 0.01
    val L: Real = 2.0
    val m: Real = 1.5
    val g: Real = 9.80665
    val k1t = w
    val k1w = -g/L * sine(t)
    val k2t = w + h/2*k1w
    val k2w = -g/L * sine(t + h/2*k1t)
    val tNew = t + h*k2t
    val wNew = w + h*k2w

    pendulum(tNew, wNew, n + 1)
  } else {
    (t, w)
  }
}

Figure 7.6 – Pendulum simulation with sine approximation

During the analysis, our tool determines the following propagation coefficient matrix K for the order 3 approximation:

$$K = \begin{pmatrix} 1.0002500818333124 & 0.01 \\ 0.05250059956010406 & 1.0002625029978005 \end{pmatrix}$$
and for the order 5 approximation:

$$K = \begin{pmatrix} 1.0000833441850905 & 0.01 \\ 0.04903325006220655 & 1.0000872967293692 \end{pmatrix}$$

Notice that the latter numbers are smaller, partly by an order of magnitude, which strongly suggests that this approximation is preferable with respect to numerical errors.

Table 7.4 compares error bounds computed by Fluctuat, again with loop unrolling, against the results obtained by our tool. Rosa’s loop unrolling again fails to provide results in acceptable time. Fluctuat is able to compute error bounds for smaller numbers of iterations, although our bounds are more accurate nearly throughout. For larger numbers of iterations, Rosa is the only tool that can still compute meaningful error bounds.


# iter    Fluctuat    time   Our tool    time

order 3
5         1.46e-15    0.04   1.47e-15    4
10        2.88e-15    0.13   2.87e-15    4
15        4.53e-15    0.39   4.44e-15    4
20        6.51e-15    0.89   6.19e-15    4
25        8.98e-15    2      8.15e-15    4
50        5.07e-14    16     2.21e-14    4
100       –           ∞      9.07e-14    4
250       –           ∞      3.11e-12    4
500       –           ∞      9.58e-10    4
1000      –           ∞      9.02e-5     4

order 5
5         1.47e-15    0.06   1.47e-15    8
10        2.92e-15    0.35   2.88e-15    8
15        4.68e-15    1      4.45e-15    8
20        6.95e-15    3      6.21e-15    8
25        1.01e-14    5      8.18e-15    8
50        2.43e-13    49     2.21e-14    8
100       –           ∞      8.82e-14    8
250       –           ∞      2.67e-12    8
500       –           ∞      6.54e-10    8
1000      –           ∞      3.89e-5     8

Table 7.4 – Comparison of absolute errors on t and running times computed by Fluctuat and our tool on the pendulum benchmark. Time is measured in seconds.

Also note that, while Fluctuat’s computation time grows with the number of iterations, our time is constant, since the propagation coefficient matrix is the same for all iterations in this case, and, using our closed-form expression for the errors, the final loop bound computation is fast.

Gravity Simulation  Figure 7.7 shows the code of our Jupiter simulation, which we adapted from the nbody benchmark from [6]. We have abbreviated -6 <= x && x <= 6 by x ∈ [-6, 6] for readability. Note that the precondition includes the clause x*x + y*y + z*z >= 20.0 imposing a minimum distance between planets, which ensures that taking the square root is safe, even in the presence of numerical errors, and that dividing by the result is similarly well behaved.


def step(x: Real, y: Real, z: Real, vx: Real, vy: Real, vz: Real,
         i: LoopCounter): (Real, Real, Real, Real, Real, Real) = {
  require(x ∈ [-6, 6] && y ∈ [-6, 6] && z ∈ [-0.2, 0.2] &&
          vx ∈ [-3, 3] && vy ∈ [-3, 3] && vz ∈ [-0.1, 0.1] &&
          x*x + y*y + z*z >= 20.0 &&
          ~x ∈ [-6, 6] && ~y ∈ [-6, 6] && ~z ∈ [-0.2, 0.2] &&
          ~vx ∈ [-3, 3] && ~vy ∈ [-3, 3] && ~vz ∈ [-0.1, 0.1] &&
          (~x)*(~x) + (~y)*(~y) + (~z)*(~z) >= 20.0)

  if (i < 100) {
    val dt = 0.1
    val solarMass = 39.47841760435743
    val distance = sqrt(x*x + y*y + z*z)
    val mag = dt / (distance * distance * distance)

    val vxNew = vx - x * solarMass * mag
    val vyNew = vy - y * solarMass * mag
    val vzNew = vz - z * solarMass * mag
    val x1 = x + dt * vxNew
    val y1 = y + dt * vyNew
    val z1 = z + dt * vzNew
    step(x1, y1, z1, vxNew, vyNew, vzNew, i + 1)
  } else {
    (x, y, z, vx, vy, vz)
  }
}

Figure 7.7 – Planet orbiting the Sun simulation

At the same time, this constraint also makes the analysis of the loop errors challenging, as does the presence of the square root and the large number of variables. The plot from the introduction (Figure 1.2) reports the errors of this benchmark for one specific initial configuration (corresponding to the position of Jupiter). The analysis done by our tool in this example covers all configurations where the planet's position and velocity coordinates satisfy the constraints in the precondition. The constraint x*x + y*y + z*z >= 20.0 also limits the possible orbits we want to do the analysis for. The following table compares the final errors computed for the given constraint and for the case when we relax the condition to x*x + y*y + z*z >= 15.0, for 100 iterations. As the number of possible cases our tool has to consider is larger, we also expect the over-approximation committed during the analysis to grow.

       x² + y² + z² ≥ 15    x² + y² + z² ≥ 20
x      8.473e-07            1.894e-08
y      1.174e-07            2.087e-09
z      9.190e-07            2.635e-08


As the table shows, the over-approximation is modest and our tool still computes meaningful results. To compare with the simulation from the introduction, the errors on x, y, z after 100 iterations were on the order of 1e-14. These errors are, however, obtained for one specific run, whereas our tool performs a worst-case analysis for a large number of runs at the same time. To our knowledge, our tool is the only one that can handle programs of this complexity. Fluctuat returns with an error bound of [−∞, ∞], and Rosa again does not finish unrolling in a reasonable time.

Conclusion

We have presented techniques for computing the errors due to discontinuities and we have derived a closed-form expression for numerical errors in unbounded loops. Both techniques rely crucially on a separation of errors and a range computation which can take into account additional constraints on variables. Our experimental results show that our approach substantially improves over the current state-of-the-art.


8 Certifying Results of Iterative Algorithms

Much of numerical software uses iterative algorithms to solve, for example, systems of nonlinear equations, and many are available as highly optimized black-box functions in (commercial) scientific software packages such as MATLAB [113], Octave [3] or Mathematica [5]. Often the source code is not available and the functionality and its mathematical meaning are not well documented, making it hard to verify the computed results.

Many iterative algorithms are self-stabilizing in that individual iterations are independent of each other and errors from one iteration get corrected in subsequent ones [95]. It thus does not make much sense to track round-off errors throughout the entire computation. Iterative algorithms however suffer from truncation errors: the exact result can be obtained only in the limit and at some point the iteration has to be stopped. Our runtime solution aims at quantifying these errors rigorously by using theorems from validated numerics. We integrate the error computation with Scala macros which perform a part of the computation already at compile time, yielding a sound and efficient method to certify solutions of systems of nonlinear equations.

This chapter is based on the paper [45]. Source code for our library, called Cassia, as well as all examples are available from github.com/malyzajko/cassia.

8.1 Introduction

To understand the notion of method errors we address, consider an iterative method that performs a search for the solution of $f(x) = 0$ by computing a sequence of approximations $x_0, x_1, x_2, \ldots$ One common stopping criterion for such an iteration is finding $x_k$ for which $|f(x_k)| < \varepsilon$, for a given error tolerance $\varepsilon$. From a validation point of view, however, we are ultimately interested not in $\varepsilon$ but in $\tau$ such that $|x - x_k| < \tau$, where $x$ is the actual solution in real numbers. Fortunately, we can estimate $\tau$ from $\varepsilon$ using a bound on the derivative of $f$ in an interval conservatively enclosing $x$ and $x_k$.


A tempting approach is to perform the entire computation of $x_k$ using interval or affine arithmetic. However, this approach would be inefficient, and would give too pessimistic error bounds. Instead, our method uses a runtime checking approach. We allow any standard non-validated floating-point code to compute the approximation $x_k$. We perform only the final validation of an individual candidate solution $x_k$ using a range-based computation. In this way we achieve efficiency and reusability of existing numerical routines, while still providing rigorous bounds on the total error. The bounds certified by our system are always sound for the given execution. Our system thus realizes a new kind of assertion, appropriate for numerical computation: an assertion that verifies “this was precise enough” in a way that takes into account both the numerical problem and floating-point semantics.

8.2 Examples

We illustrate our techniques with several examples that model physical processes, taken from [151, 43, 133]. These examples illustrate the applicability of our techniques and introduce the main features of our runtime library. For space reasons we abbreviate the Scala Double type with D (the code snippets remain valid Scala code using the rename-on-import Scala feature). We include variable type declarations for expository purposes, even though the Scala compiler can infer all but the function parameter types. Method names printed in bold are part of our library.

Stress on a turbine rotor We illustrate the basic features of our library on the following system of three non-linear equations with three unknowns (v,ω,r ). An engineer may need to solve such a system to compute the stress on a turbine rotor [151].

$$3 + \frac{2}{r^2} - \frac{1}{8}\,\frac{(3 - 2v)\,\omega^2 r^2}{1 - v} = 4.5$$
$$6v - \frac{1}{2}\,\frac{v\,\omega^2 r^2}{1 - v} = 2.5$$
$$3 - \frac{2}{r^2} - \frac{1}{8}\,\frac{(1 + 2v)\,\omega^2 r^2}{1 - v} = 0.5$$
Given some (blackbox) library function computeRoot and our library for certifying solutions, the engineer can write the following code:

val f1 = (v: D, w: D, r: D) ⇒ 3 + 2/(r*r) - 0.125*(3-2*v)*(w*w*r*r)/(1-v) - 4.5
val f2 = (v: D, w: D, r: D) ⇒ 6*v - 0.5 * v * (w*w*r*r) / (1-v) - 2.5
val f3 = (v: D, w: D, r: D) ⇒ 3 - 2/(r*r) - 0.125*(1+2*v)*(w*w*r*r) / (1-v) - 0.5
val x0 = Array(0.75, 0.5, 0.5)
val roots: Array[D] = computeRoot(Array(f1, f2, f3), jacobian(f1, f2, f3), x0, 1e-8)
val err: Array[Interval] = assertBound(f1, f2, f3, roots(0), roots(1), roots(2), 1e-8)


Figure 8.1 – A double pendulum standing close to an obstacle

The method assertBound takes as input the three functions of our system of equations, the previously computed roots and a tolerance and returns sound bounds on the true errors on the roots. In the case where these errors are larger than the tolerance specified, the method throws an exception and acts like an assertion. Our library also includes the method jacobian, which computes the Jacobian matrix of the functions f1, f2 and f3 symbolically at compile time.

The true roots for v, w and r are 0.5,1.0 and 1.0 respectively, and the roots and maximum absolute errors computed by the above code are

0.5, 1.0000000000018743, 0.9999999999970013
2.3684981521893e-15, 1.8806808806556e-12, 3.0005349681420e-12

Note that the error bounds that were computed are, in fact, smaller than the tolerance given to the numerical method used to compute the root.

Double pendulum The following example demonstrates how our library fits into a runtime assertion framework. A double pendulum rotates with angular velocity ω around a vertical axis (like a centrifugal regulator)[43]. At equilibrium the two pendulums make the angles x1 and x2 to the vertical axis. It can be shown that the angles are determined by the equations

$$\tan x_1 - k(2\sin x_1 + \sin x_2) = 0$$
$$\tan x_2 - 2k(\sin x_1 + \sin x_2) = 0$$

where $k$ depends on $\omega$, the lengths of the rods and gravity. Suppose the pendulum is standing close to a wall (as in Figure 8.1) and we would like to verify that in the equilibrium position it cannot hit the wall. Also suppose that the distance to the center of the pendulum is given by a function distancePendulumWall. Then the following code fragment verifies that a collision is impossible in the real world, not just in a world with floating-points.

val distancePendulumWall: SmartFloat = ...
val length = ... // length of bars
val tolerance = 1e-13
val x0 = Array(0.18, 0.25)
val f1 = (x1: D, x2: D) ⇒ tan(x1) - k * (2*sin(x1) + sin(x2))
val f2 = (x1: D, x2: D) ⇒ tan(x2) - 2*k * (sin(x1) + sin(x2))
val r: Array[D] = computeRoot(Array(f1, f2), jacobian(f1, f2), x0, tolerance)
val roots: Array[SmartFloat] = certify(r, errorBound(f1, f2, r(0), r(1), tolerance))

val L: SmartFloat = _sin(roots(0)) * length + _sin(roots(1)) * length
if (certainly(L <= distancePendulumWall)) {
  // continue computation
} else {
  // reduce speed of the pendulum and repeat
}

To account for all sources of uncertainty, we use the SmartFloat data type from chapter 3. SmartFloat performs a floating point computation while additionally keeping track of different sources of errors, including floating point round-off errors, as well as errors arising from other sources, for example, due to the approximate nature of physical measurements.

In our example, distancePendulumWall and certify both return a SmartFloat; the first one captures the uncertainty on a physical quantity, and the second one the method error due to the approximate iterative method. If the comparison certainly(L <= distancePendulumWall) succeeds, we can be sure the pendulum does not touch the wall. This guarantee takes into account round-off errors committed during the calculation, as well as the error committed by the computeRoot method and their propagation throughout the computation.

State equation of a gas Values of parameters may only be known within certain bounds but not exactly, for instance if we take inputs from measurements. Our library provides guarantees even in the presence of such uncertainties. Equation 8.1 below relates the volume V of a gas to the temperature T and the pressure p, given parameters a and b that depend on the specifics of the gas, N the number of molecules in the volume V and k the Boltzmann constant [133].

$$[p + a(N/V)^2](V - Nb) = kNT \qquad (8.1)$$
If T and p are given, one can solve the nonlinear Equation 8.1 to determine the volume occupied by the (very low-pressure) gas. Note, however, that this is a cubic equation, for which closed-form solutions are non-trivial, and their approximate computation may incur

substantial round-off errors. Using an iterative method, whose result is verified by our library, is thus preferable:

val T = 300; val a = 0.401; val b = 42.7e-6
val p = 3.5e7; val k = 1.3806503e-23; val x0 = 0.1
val N: Interval = 1000 +/- 5
val f = (V: D) ⇒ (p + a * (N.mid / V) * (N.mid / V)) * (V - N.mid * b) -
                  k * N.mid * T
val V: D = computeRoot(f, derivative(f), x0, 1e-9)
val Vcert: SmartFloat = certify(V, assertBound(f, V, 0.0005))

We make the assumption that we cannot determine the number of molecules N exactly, but we are sure that our number is accurate at least to within ±5 molecules (line 3). We compute the root as if we knew N exactly, using the middle value of the interval and the standard Newton's method, and only check a posteriori that the result is accurate up to ±0.0005 m³, for all N in the interval [995, 1005]. Our library will confirm this, providing us also with the (certified) bounds on V: [0.0424713, 0.0429287].

8.3 Certification Technique

Our certification technique is based on several theorems from the area of validated numerics. It can verify roots of a system of nonlinear equations computed by an arbitrary black-box solution or estimation method.

In the following, we denote computed approximate solutions by $\tilde{x}$ and true roots by $x$, as before. $\mathbb{IR}$ denotes the domain of intervals over the real numbers $\mathbb{R}$, and variables written in bold type, e.g. $\mathbf{X}$, denote interval quantities. For a function $f$, we define $f(\mathbf{X}) = \{f(x) \mid x \in \mathbf{X}\}$. All errors are given in absolute terms. The error tolerance, that is, the maximum acceptable value for $|\tilde{x} - x|$, will be denoted by $\tau$ or tolerance in code. We will use the term range arithmetic to mean either interval arithmetic or affine arithmetic. The material presented in this section is valid for any such arithmetic, as long as it computes guaranteed enclosures containing the result that would be computed in real numbers. We wish to compute a guaranteed bound on the error of a computed solution, that is, determine an upper bound on $\Delta x = \tilde{x} - x$. Note that $\Delta x$ is different from $\tau$, because $\Delta x$ considers the sign of the difference.

8.3.1 Unary Case

For expository purposes, consider first the unary case $f : \mathbb{R} \to \mathbb{R}$, $f$ differentiable, and suppose that we wish to solve the equation $f(x) = 0$. Then, by the Mean Value Theorem,

$$f(\tilde{x}) = f(x + \Delta x) = f(x) + f'(\xi)\,\Delta x$$

115 Chapter 8. Certifying Results of Iterative Algorithms where ξ X and X is a range around x˜ sufficiently large to include the true root. Since f (x) 0, ∈ = f (x˜) ∆x (8.2) ∈ f 0(X)

We now have set membership instead of equality because the right-hand side is now a range-valued expression, which takes into account the fact that $\xi$ in the Mean Value Theorem is not known exactly. The following theorem (stated in the formulation from [137]) formalizes this idea.

Theorem 8.1 Let a differentiable function $f : \mathbb{R} \to \mathbb{R}$, $\mathbf{X} = [x_1, x_2] \in \mathbb{IR}$ and $\tilde{x} \in \mathbf{X}$ be given, and suppose $0 \notin f'(\mathbf{X})$. Define
$$N(\tilde{x}, \mathbf{X}) := \tilde{x} - f(\tilde{x})/f'(\mathbf{X}). \qquad (8.3)$$
If $N(\tilde{x}, \mathbf{X}) \subseteq \mathbf{X}$, then $\mathbf{X}$ contains a unique root of $f$. If $N(\tilde{x}, \mathbf{X}) \cap \mathbf{X} = \emptyset$, then $f(x) \ne 0$ for all $x \in \mathbf{X}$.

Claim 8.2 If, following Equation 8.2, we compute an interval $\mathbf{\Delta x} = f(\tilde{x})/f'(\mathbf{X})$ enclosing the upper bound on the error $\Delta x$, and if $\mathbf{\Delta x} \subseteq [-\tau, \tau]$, then the approximately computed result $\tilde{x}$ is indeed within the specified precision $\tau$.

Indeed, choose $\mathbf{X} = [\tilde{x} - \tau, \tilde{x} + \tau]$, i.e. the computed approximate solution plus or minus the tolerance we want to check, and compute $\mathbf{\Delta x} = \frac{f(\tilde{x})}{f'(\mathbf{X})}$. Then the condition $N(\tilde{x}, \mathbf{X}) \subseteq \mathbf{X}$ from Theorem 8.1 becomes
$$N(\tilde{x}, \mathbf{X}) = \tilde{x} - \mathbf{\Delta x} \subseteq \mathbf{X} = [\tilde{x} - \tau, \tilde{x} + \tau]$$
If $\mathbf{\Delta x} \subseteq [-\tau, \tau]$, this condition holds, and thus the computed result is within the specified precision.
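As a rough numeric illustration (approximate values, not taken from the thesis): for $f(x) = x^2 - 2$, a computed root $\tilde{x} = 1.41421356$ and tolerance $\tau = 10^{-6}$, we have $f(\tilde{x}) \approx -6.7 \cdot 10^{-9}$ and $f'(\mathbf{X}) \approx [2.828, 2.829]$, so $\mathbf{\Delta x} \approx [-2.4 \cdot 10^{-9}, -2.3 \cdot 10^{-9}] \subseteq [-\tau, \tau]$: the candidate is certified, with a certified error bound much tighter than the requested tolerance.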

Our assertion library uses the procedure in Figure 8.2 for unary problems. Note that we not only check that errors are within a certain error tolerance, but we also return the computed error bounds. As we show in Section 8.5, the computed error bounds tend to be much tighter than the user-required tolerance. As the examples in section 8.2 illustrate, this error bound can be used in subsequent computations to track overall errors more precisely.

def assertBound(Function, Derivative, xn, τ):
  X = [xn ± τ]
  error = Function(xn) / Derivative(X)
  if (error ∩ [-τ, τ] = ∅) throw SolutionNotIncludedException
  if (¬(error ⊂ [-τ, τ])) throw SolutionCannotBeVerifiedException
  return error

Figure 8.2 – Algorithm for computing errors in the unary case
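For concreteness, the following is a minimal Scala sketch of this unary check; the Interval type with its constructor, division, intersects and contains operations is assumed to exist with outward rounding, and the exception classes simply mirror Figure 8.2 — none of this is Cassia's actual API.

class SolutionNotIncludedException extends Exception
class SolutionCannotBeVerifiedException extends Exception

// Sketch of the unary certification check from Figure 8.2, assuming a hypothetical
// Interval type with outwardly rounded arithmetic.
def assertBoundSketch(f: Interval ⇒ Interval, fPrime: Interval ⇒ Interval,
                      xn: Double, tol: Double): Interval = {
  val X = Interval(xn - tol, xn + tol)         // range that must contain the true root
  val error = f(Interval(xn, xn)) / fPrime(X)  // enclosure of the error, Equation 8.2
  val tolIv = Interval(-tol, tol)
  if (!error.intersects(tolIv))                // no root within the tolerance
    throw new SolutionNotIncludedException
  if (!tolIv.contains(error))                  // enclosure too wide to certify
    throw new SolutionCannotBeVerifiedException
  error
}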


8.3.2 Multivariate Case

Our error estimates for the unary case follow from the Mean Value Theorem, which also extends to $n$ dimensions. Theorem 8.3 follows the interval formulation of [137], where $J_f$ is the Jacobian matrix of $f$. If $D = (\mathbf{x}_1, \ldots, \mathbf{x}_n) \in \mathbb{IR}^n$, let $\bar{D}$ denote $\mathbf{x}_1 \times \ldots \times \mathbf{x}_n$. For $a, b \in \bar{D}$, define the convex union as $a \cup b = \{a + \lambda(b - a) \mid \lambda \in [0,1]\}$. For $A \subseteq \bar{D}$, define $\mathrm{hull}(A) := \bigcap\{Z \in \mathbb{IR}^n \mid A \subseteq Z\}$.

Theorem 8.3 Let there be given a continuously differentiable $f : \bar{D} \to \mathbb{R}^n$ with $D \in \mathbb{IR}^n$ and $x, \tilde{x} \in \bar{D}$. Then, for $\mathbf{X} := \mathrm{hull}(x \cup \tilde{x})$,
$$f(x) \in f(\tilde{x}) + J_f(\mathbf{X})(x - \tilde{x})$$

We extend our method for computing the error on each root in a similar manner:

$$\delta \in J_f^{-1}(\mathbf{X}) \cdot (-f(\tilde{x})) \qquad (8.4)$$
where $\delta = x - \tilde{x}$ is the vector of errors on our tentative solution. Since we now must consider the Jacobian of $f$ instead of a single derivative function, we can no longer solve for the errors by a simple scalar division. We wish to find the maximum possible error, so we need a way to compute an upper bound on the right-hand side of Equation 8.4. Computing the inverse of a Jacobian matrix in range arithmetic typically does not yield a useful result, due to over-approximation. Instead, we use the following Theorem 8.4, which is originally due to [96], but we use the formulation by [137].

Theorem 8.4 ([137]) Let $A, R \in \mathbb{R}^{n \times n}$, $b \in \mathbb{R}^n$ and $\mathbf{E} \in \mathbb{IR}^n$ be given, and denote by $I$ the identity matrix. Assume
$$Rb + (I - RA)\mathbf{E} \subset \mathrm{int}(\mathbf{E}), \qquad (8.5)$$
where $\mathrm{int}(\mathbf{E})$ denotes the interior of the set $\mathbf{E}$. Then the matrices $A$ and $R$ are non-singular and $A^{-1}b \in Rb + (I - RA)\mathbf{E}$.

We instantiate Theorem 8.4 with all possible matrices $A$ such that $A \in J_f(\mathbf{X})$ and all possible vectors $b$ such that $b \in -f(\tilde{x})$, where $J_f(\mathbf{X})$ and $-f(\tilde{x})$ are both evaluated in range arithmetic. Combining with Condition 8.4, we obtain

$$\delta \in J_f^{-1}(\mathbf{X}) \cdot (-f(\tilde{x})) \;\subseteq\; Rb + (I - RA)\mathbf{E}, \qquad (8.6)$$
provided that Condition 8.5 is satisfied in range arithmetic.

Matrix $R$ in Theorem 8.4 can be chosen arbitrarily as long as Condition 8.5 holds. A common choice is to use an approximate inverse of $A$. In our case, $A$ is range-valued, so we first compute the matrix whose entries are the midpoints of the intervals of $A$, and use its inverse as $R$.


def assertBound(functions, Jacobian, xn, τ):
  Xn = [xn ± τ]
  A = Jacobian(Xn)           // goal is to certify that xn is a zero of 'functions' up to τ
  b = -functions(xn)
  R = inverse(mid(A))        // calculated in ordinary floating points
  E = [0 ± τ]
  errors = R*b + (I - RA)E   // Theorem 8.4
  if (errors ∩ [-τ, τ]ⁿ = ∅ⁿ) throw SolutionNotIncludedException
  if (¬(errors ⊆ [-τ, τ]ⁿ)) throw SolutionCannotBeVerifiedException
  return errors

Figure 8.3 – Algorithm for computing errors in the multivariate case

It now remains to determine $\mathbf{X}$. We choose it to be the vector whose $i$-th entry is the interval around $\tilde{x}_i$ with width $\tau$. If we can then show that Condition 8.6 holds, we have proven that $\mathbf{X}$ indeed contains a solution. Moreover, we have computed a tighter upper bound on the error. We obtain the procedure in Figure 8.3 for computing error bounds for systems of equations. The variables Xn, A, b, E and errors are all range valued.

Our approach requires the derivatives to be non-zero, respectively the Jacobian to be non-singular, in the neighborhood of the root. This means that at present we can only verify single roots. Verifying multiple roots is an ill-conditioned problem by itself, and thus requires further approximation techniques, as well as dealing with complex values. We leave this for future work. Our library does distinguish the case when an error is provably too large from the case when our method is unable to ensure the result: we use two different exceptions for this purpose.

8.4 Implementation

Now that we have the theoretical building blocks, the question is how to integrate them into a general-purpose programming language like Scala such that the resulting assertion framework for real numbers is intuitive to use, but at the same time efficient. In particular, Algorithms 8.2 and 8.3 require the computation of derivatives and their evaluation in range arithmetic, but we do not want to require the user to provide two differently typed functions, one in Doubles for the solver and one in Intervals for our verification method. Also, the solver may not actually require derivatives or the Jacobian, hence this computation should be performed automatically and symbolically at compile time. As the verification is a dynamic one, we identify the Scala macro framework as an ideal candidate for the integration of our library.


8.4.1 Scala Macros

Scala version 2.10 introduced a macro facility [31]. To the user, macros look like regular methods, but their code is executed at compile time and performs a transformation on the Scala compiler's abstract syntax tree (AST). Thus, by passing a regular function to a macro, we can access its AST and perform the necessary transformations. The type checker runs after the macro expansion, which means that the resulting code retains all guarantees from Scala's strong static typing. Our library provides the following functions:

def errorBound(f: (Double ⇒ Double), x: Double, tol: Double): Interval
def assertBound(f: (Double ⇒ Double), x: Double, tol: Double): Interval
def certify(root: Double, error: Interval): SmartFloat

and similarly for functions of 2, 3 and more variables. The function assertBound computes the guaranteed bounds on the errors using Algorithms 8.2 and 8.3. errorBound removes the assertion check and only provides the computed error; the programmer is then free to define individual assertions. certify wraps the computed root(s) including their associated errors in the SmartFloat data type and thereby provides the link to our assertion checking framework. We also expose the automatic symbolic derivative computation facility:

def derivative(f: Double ⇒ Double): (Double ⇒ Double)
def jacobian(f1: (Double, Double) ⇒ Double, f2: (Double, Double) ⇒ Double):
  (Array[Array[(Double, Double) ⇒ Double]])
...

The functions passed to our macros have type (Double*) => Double and may be given as anonymous functions, or alternatively defined in the immediately enclosing method or class. The functions may use parameters, with the same restrictions on their original definitions. This is particularly attractive, as it allows us to write concise code as presented in the code snippets in Section 8.2.
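As a small illustration of how these functions fit together, consider the following hypothetical snippet; the solver newtonSolve, the concrete function and the chosen tolerance are assumptions made up for this sketch and are not part of the library:

// certify the positive root of x^2 - 2 computed by some external solver
val f = (x: Double) => x * x - 2.0
val approxRoot: Double = newtonSolve(f, 1.0)   // hypothetical solver, returns roughly 1.4142135623

// guaranteed error bounds; an exception is thrown if the result cannot be certified
val err: Interval = assertBound(f, approxRoot, 1e-10)

// carry the certified root and its error into subsequent SmartFloat computations
val sqrtTwo: SmartFloat = certify(approxRoot, err)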

Integration With SmartFloats We combine the result certification with the SmartFloat runtime library. If no exceptions are thrown, the program would take the same path if real numbers were used instead of floating-point numbers, and the computed values are within the bounds computed by the SmartFloat data type. This assertion language thus tracks two sources of errors:

• quantization errors due to the discrete floating-point number representation

• method errors due to the approximate numerical method

The bounds on computed values are ensured by using SmartFloats throughout the straight-line computations. Note that the numerical methods for computing the roots can still use only Doubles, since we verify the result a posteriori.


8.4.2 Computing Derivatives

We now turn our attention to efficiency. Given the function ASTs, we compute the derivatives or Jacobian matrices already at compile time, and thus need to do this symbolically. The straightforward runtime option is to use automatic differentiation [76]. We will show, however, that this incurs an unnecessary computation cost. It turns out that fairly simple optimizations on top of the usual derivative rules already provide the needed precision and efficiency (a small illustrative sketch follows the list):

• constants are pulled outside of multiplications (before derivation)

• multiplications of the same terms are compacted into a power function (before derivation)

• multiplication and addition of zeros or ones arising from the differentiation are simplified (after derivation)

• powers with integers are evaluated by repeated multiplication (at runtime)
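The following minimal sketch (our illustration, not the actual macro implementation) shows the flavor of the third rule above, i.e. simplifying the zeros and ones produced by differentiation, on a toy expression AST; all names are made up for this example:

sealed trait Expr
case class Const(v: Double)        extends Expr
case class Variable(name: String)  extends Expr
case class Plus(l: Expr, r: Expr)  extends Expr
case class Times(l: Expr, r: Expr) extends Expr

def simplify(e: Expr): Expr = e match {
  case Plus(l, r) => (simplify(l), simplify(r)) match {
    case (Const(0.0), rs) => rs                     // 0 + e -> e
    case (ls, Const(0.0)) => ls                     // e + 0 -> e
    case (ls, rs)         => Plus(ls, rs)
  }
  case Times(l, r) => (simplify(l), simplify(r)) match {
    case (Const(0.0), _)  => Const(0.0)             // 0 * e -> 0
    case (_, Const(0.0))  => Const(0.0)             // e * 0 -> 0
    case (Const(1.0), rs) => rs                     // 1 * e -> e
    case (ls, Const(1.0)) => ls                     // e * 1 -> e
    case (ls, rs)         => Times(ls, rs)
  }
  case leaf => leaf
}

// e.g. simplify(Plus(Times(Variable("x"), Const(1.0)), Const(0.0))) == Variable("x")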

Overall, the effect is that the resulting derivative expressions do not blow up. This is important for evaluation efficiency, since each operation carries a computation cost (see Table 8.3). On the other hand, precision may be affected as well, since the over-approximation committed by range arithmetic may depend on the formulation of the expression. We have compared the errors computed with our symbolic differentiation routine against the results obtained with manually provided derivatives. The latter have the format one would compute by hand on paper. We did the comparison on our unary benchmark problems (Table 8.1), and it turns out that except for two instances, the errors computed are exactly the same. For these two functions, the manually provided derivatives actually yield a worse error, but the precision is still sufficient to prove that the solutions are correct to within the given tolerance.

8.4.3 Uncertain Parameters

Theorem 8.4 also holds for range-valued A and b. It is thus natural to extend our macro functions to also accept range-valued parameters. The SmartFloat data type already has the facility to keep track of errors added manually by the user, so we can track external uncertainties as a third source of errors. Consider again the gas state equation example from Section 8.2, especially the following two lines:

val N = 1000 +/- 5
val f = (V: D) ⇒ (p + a * (N.mid / V) * (N.mid / V)) * (V - N.mid * b) - k * N.mid * T

The +/- method, in this context, returns an Interval, which in turn defines the mid method. Thus, the function type checks correctly and can be passed, for example, to a solver, but inside the macro we can use the interval version of the parameter.
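For illustration, the following minimal sketch (our own, not the actual library code) shows the two pieces the snippet above relies on: a +/- method that constructs an Interval, and the Interval's mid accessor. The names and the implicit-conversion style are assumptions made for this example.

object UncertainParams {
  case class Interval(lo: Double, hi: Double) {
    def mid: Double = lo + (hi - lo) / 2.0
  }
  implicit class UncertainSyntax(val center: Double) extends AnyVal {
    def +/-(radius: Double): Interval = Interval(center - radius, center + radius)
  }

  val N = 1000.0 +/- 5.0   // Interval(995.0, 1005.0)
  val approx = N.mid       // 1000.0, a plain Double that can be passed to a solver
}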


8.5 Evaluation

Accuracy The theorems from Section 8.3 provide us with sound guarantees regarding upper bounds. In practice, however, we also need our method to be precise. Since our library computes error bounds and not only binary answers for assertions, we are interested in obtaining as precise error estimates as possible. We have evaluated the precision of our approach in the following way. We compute a high-precision estimate of the root(s) using the QuadDouble library [16], which allows us to compute the true error on the computed solutions with high confidence. We compare this error to the one provided by our library. The results on a number of benchmark problems chosen from numerical analysis textbooks are presented in Tables 8.1 and 8.2. We are able to confirm the error bounds specified by the user in all cases. In fact, on all examples that we tried, our library only failed in the case of a multiple root, for the reasons explained in Section 8.3, and never for precision reasons.

We split the evaluation between the unary case and the multivariate case because of their different characteristics. All numbers are the maximum absolute errors computed. The numbers in parentheses are the tolerances given to the solvers and have been chosen randomly to simulate the different demands of the real world. We highlight the better error estimates in bold.

First of all, we note that the precision of the error estimates we obtain is remarkably good. Another, perhaps surprising, result of our experiments is that using interval arithmetic is generally more precise (in the unary case) or not much worse (in the multivariate case) than affine arithmetic, although the latter is usually presented as the superior approach. Indeed, for the tracking of round-off errors we have shown in Chapter 3 that affine arithmetic provides (sometimes much) better results than interval arithmetic. The reason why intervals perform so well here is that for transcendental functions they are able to compute a tighter range, since affine arithmetic has to compute a linear approximation of those functions. The exceptions in the unary case are the degree 6 polynomial and the carbon gas state equation example, which confirms our hypothesis, since in those cases the dependency tracking of affine arithmetic can recover some of the imprecision in the long run.

For the multivariate case, affine arithmetic performs generally better because the computation consists largely of linear arithmetic. Due to the larger computation cost, however, we leave the choice of arithmetic to the user and select interval arithmetic as the default.

8.5.1 Performance

Table 8.3 compares the performance of our implementation when using affine arithmetic, interval arithmetic, or interval arithmetic without the differentiation optimizations listed in Section 8.4.2. Switching off the optimizations is similar to performing automatic differentiation. We can see that our optimizations actually make a big difference in the runtimes, improving by up to 37%


Problem (tolerance specified)        certified (affine)   certified (interval)   true errors

system of rods (1e-10)               7.315e-13            1.447e-13              1.435e-13
Verhulst model (1e-9)                4.891e-10            9.783e-11              9.782e-11
predator-prey model (1e-10)          7.150e-11            7.147e-11              7.146e-11
carbon gas state equation (1e-12)    1.422e-17            2.082e-17              1.625e-26
Butler-Volmer equation (1e-10)       4.608e-15            3.8960e-15             3.768e-17
(x/2)^2 - sin(x) (1e-10)             7.4e-16              5.879e-16              1.297e-16
e^x (x-1) - e^{-x} (x+1) (1e-8)      5.000e-10            5.000e-10              5.000e-10
degree 3 polynomial (1e-7)           7.204e-9             1.441e-9               1.441e-9
degree 6 polynomial (1e-5)           2.741e-14            3.538e-14              2.258e-14

Table 8.1 – Comparison of errors for unary benchmarks. All numbers are rounded.

Problem (tolerance specified)          certified (affine)   certified (interval)   true errors

stress distribution (1e-10)            3.584e-11            3.584e-11              3.584e-11
                                       4.147e-11            4.147e-11              4.147e-11
sin-cosine system (1e-7)               6.689e-9             6.689e-9               6.689e-9
                                       6.655e-9             6.655e-9               6.6545e-9
double pendulum (1e-13)                4.661e-15            5.454e-15              5.617e-17
                                       6.409e-15            7.449e-15              9.927e-17
circle-parabola intersection (1e-13)   5.5510e-17           1.110e-16              8.0145e-51
                                       1.110e-16            1.110e-16              5.373e-17
quadratic 2d system (1e-6)             2.570e-12            3.326e-12              2.192e-12
                                       3.025e-09            3.025e-9               3.024e-9
turbine rotor (1e-12)                  1.517e-13            1.523e-13              1.514e-13
                                       1.707e-13            1.724e-13              1.703e-13
                                       1.908e-14            1.955e-14              1.887e-14
quadratic 3d system (1e-10)            4.314e-16            6.795e-16              1.2134e-16
                                       5.997e-16            1.632e-15              7.914e-17
                                       4.349e-16            5.127e-16              7.441e-17

Table 8.2 – Comparison of errors for multivariate benchmarks. All numbers are rounded.


Problem set      affine     interval   interval w/o optimizations   quadruple precision

unary problems   2.170ms    0.459ms    0.733ms                      17.196ms
2D problems      2.779ms    0.984ms    1.240ms                      4.446ms
3D problems      3.563ms    1.063ms    1.515ms                      16.605ms

Table 8.3 – Average runtimes for the benchmark problems from Tables 8.1 and 8.2. Averages are taken over 1000 runs.

Problem                      affine     interval

carbon gas state equation    0.272ms    0.084ms
double pendulum problem      0.784ms    0.228ms
turbine problem              2.643ms    0.644ms
degree 3 polynomial          0.116ms    0.044ms
quadratic 2d system          0.425ms    0.200ms
quadratic 3d system          0.943ms    0.460ms

Table 8.4 – Runtimes for individual problems. Averages are taken over 1000 runs.

for unary functions and 30% for our 3D problems over pure differentiation. On the other hand, the table clearly shows that affine arithmetic is much less efficient than interval arithmetic (roughly a factor of 3-4.5), so it should only be used if precision is of major importance.

We have also included the runtimes of re-computing the root(s) in quadruple precision. That is, we used approximately 64 decimal digits for all calculations of the numerical method. The runtimes illustrate that this approach for computing trustworthy results is clearly unsuitable from the performance point of view, and it would not actually provide any guarantees on errors either, merely more confidence.

Table 8.4 illustrates the dependence of runtimes on the complexity of the problems. The first three problems are those from our examples in Section 8.2, and the second set comprises relatively short polynomial equations. Clearly, runtimes depend both on the type of equations, transcendental functions being more expensive, as well as on the size of the system of equations. Note, however, that the increases are appropriate given the increased complexity of the problems.

8.5.2 Application to Optimization

The presented techniques are also applicable to verifying whether solutions to an optimization problem are close enough to the true local solution. We illustrate this with a case study where the goal is to estimate the abstract state of a power system through physical measurements. Today’s power grids are more and more distributed, and instead of a few power plants that inject energy into the system, many small producers, such as home owners with solar panels,

contribute as well. This makes it challenging to provide a sufficient, but not more than necessary, amount of power in the grid. It is thus necessary to always have up-to-date measurements of the current state.

In our case study, in collaboration with the Computer Communications and Applications Laboratory 2 (LCA2) at EPFL [38], the system state is given by a vector $x_i$, with $i = 1 \ldots 5$, and cannot be measured directly. Instead, it is possible to derive it from noisy physical measurements in the network. The method used in our example for computing the most likely state given the measurements is weighted least-squares. Instead of describing the problem-specific details, we give here the final equation which is to be minimized:

$$
\begin{aligned}
g(x_1,x_2,x_3,x_4,x_5) = {} & \frac{1}{\epsilon_1}(z_1 - x_1)^2 + \frac{1}{\epsilon_2}(z_2 - x_2)^2 + \frac{1}{\epsilon_3}(z_3 - x_3)^2 + \frac{1}{\epsilon_4}(z_4 - x_4)^2 + \frac{1}{\epsilon_5}(z_5 - x_5)^2 \\
& + \frac{1}{\epsilon_6}\bigl(z_6 - A_{13}x_4x_1\cos(B_{13} - x_5) - A_{23}x_4x_2\cos(B_{23} + x_3 - x_5) - A_{33}x_4^2\cos(B_{33})\bigr)^2 \\
& + \frac{1}{\epsilon_7}\bigl(z_7 + A_{13}x_4x_1\sin(B_{13} - x_5) + A_{23}x_4x_2\sin(B_{23} + x_3 - x_5) + A_{33}x_4^2\sin(B_{33})\bigr)^2
\end{aligned}
$$

$z_1, \ldots, z_7$ are measurements, and $\epsilon_1, \ldots, \epsilon_5 = 10^{-10}$ and $\epsilon_6, \epsilon_7 = 10^{-8}$ are the weights, which capture the noise on the measurements. The constant matrices $A, B$ are given for each problem and are determined by the network. Then, given a set of measurements and a solution to the minimization problem computed by MATLAB, we want to check that it is indeed a solution.

Our framework as presented in this chapter is not applicable as-is for verifying optimization problems. However, if we want to optimize a function $f$, then any (local) optimum $y$ has to satisfy $f'(y) = 0$; for our example this means certifying the stationarity system $\partial g / \partial x_i = 0$ for $i = 1, \ldots, 5$. This problem fits nicely within the capabilities of our system. For this example, we have computed the partial derivatives of $g$ by hand, but our tool can in principle be straightforwardly extended to handle certification of optimization problems fully automatically. We can then check the MATLAB-computed solution by calling our library:

val error = errorBound(List(dg1, dg2, dg3, dg4, dg5), computedState, tolerance)

where dg1, ..., dg5 are the partial derivatives of g. The error tolerance here is $10^{-8}$. For example, given the following set of measurements and the computed state:

val z1 = 1.000000000110817; val z2 = 0.998647460735060
val z3 = -0.001756247355916; val z4 = 0.998479207559061
val z5 = -0.001992728994125; val z6 = -0.017997984487926
val z7 = -0.005257996439913
val computedState = Array(1.000000000110817, 0.998647460656304,
  -0.001756247550707, 0.998479207637711, -0.001992728799334)

our tool can certify that the computed state is indeed a solution to the optimization problem within the tolerance and can narrow the errors on $x$ even further:

error on

x1   [-4.4408921429095500e-16, 4.4408921429095500e-16]
x2   [-2.9093174937193904e-13, 2.9103233737160703e-13]
x3   [-2.7284766119102693e-13, 2.7206444647703290e-13]
x4   [-2.9142440391968894e-13, 2.9231758426003640e-13]
x5   [-2.7206423473908830e-13, 2.7284788313395123e-13]

We believe that this shows that our technique is widely applicable and can be quite useful when integrated with systems like MATLAB.

Conclusion

In this chapter we have shown how theorems from validated numerics can be adapted and integrated into a programming language for certifying solutions of nonlinear equations. Furthermore, by recognizing that self-correcting iterative methods require a different verification approach than forward computations, we can provide a better-adapted technique. Finally, this chapter also shows another integration strategy for verification methods into a programming language.


9 Related Work

In this chapter we would like to provide a representative (but certainly only partial) overview of alternative approaches for the analysis, verification and synthesis of numerical programs. Many of these have different goals than ours, and we view them as complementary to our work.

9.1 Applicability and Portability

Our techniques and tools are implemented in Scala and for the analysis of Scala programs, but they are directly applicable and straightforwardly portable to other programming languages which support the IEEE 754 floating-point standard. Scala was chosen in part for its flexibility in writing both application programs and the analysis tools themselves. That said, we do not fundamentally rely on any feature that is not also available in most other programming languages. In particular, the back-end of our tool Rosa can easily be changed to generate code in another programming language, which is then compiled further by the appropriate compiler.

Most programming languages support the IEEE 754 standard to the (limited) extent that our analyses rely upon, including C and Fortran, the languages more commonly used for numerical programs. Our techniques further rely on compilers not re-arranging numerical computations, as this changes the roundoff error accumulation and propagation. This behavior can, in general, be switched off via compiler flags.

9.2 Sound Finite-Precision Computations

A substantial amount of research has gone into proving two types of properties: the absence of runtime errors, such as division-by-zero exceptions, and more expressive functional properties for finite-precision code. Much of the work in this section reduces to, or is concerned with, determining sound enclosures of mathematical expressions and is thus a prerequisite

for an error analysis. We will review work that explicitly quantifies the difference between the ideal real-valued computation and its finite-precision implementation in the next section.

Affine Arithmetic and Other Range Bounding Techniques

We have decided to use affine arithmetic because it seems to us to be a good compromise between complexity and functionality. Our implementation of affine arithmetic can also be used independently from the round-off error computation. Alternative implementations include, for example, [148], which is the library from the original authors of affine arithmetic [52], and [85]. The latter uses the Chebyshev approximation and is implemented in double floating-point precision, both of which we found to be insufficient for our purpose. It further does not appear to use directed rounding, hence it is not clear whether the results are entirely sound. Solutions for the over-approximations committed by affine arithmetic have also been proposed. Zhang et al. [152], for example, reduce over-approximations due to multiplication by repeatedly refining the approximation until a desired accuracy is obtained. While the method provides a possible solution to the over-approximation problem, it is also computationally expensive and only applicable to multiplication. Another improvement over standard interval arithmetic is generalized interval arithmetic [79], which is essentially an affine form but with interval-valued coefficients. The alternative implementations presented here are used to compute ranges only, that is, as an alternative to interval arithmetic and not for numerical error computations.

Affine arithmetic and similar range computation techniques are often used in validated computations, that is, in computations where a sound enclosure of a function is required, given possibly interval-valued inputs. For example, other range-based methods are surveyed in [112] in the context of plotting curves. The authors conclude that affine arithmetic has similar accuracy as the other top-rated methods. Shou et al. [147] extend affine arithmetic to keep more correlations during multiplications, but the method is specific and only works for up to three variables. Ershov et al. [59] present an interval library where elementary mathematical functions are approximated by Chebyshev and Taylor series, which may be a possible alternative for our nonlinear approximations. Approaches like interval and affine arithmetic are also often used together with a subdivision scheme, which is also supported by Fluctuat [72]. By using an SMT solver which can capture arbitrary correlations (within its supported theory), we would like to avoid subdivisions, as they are inefficient, especially for larger numbers of variables.

Abstract Interpretation

Abstract interpretation [42] is an approach for computing sound enclosures, which is mostly used in the context of verifying error freedom of (numerical) programs. Whereas validated techniques as discussed previously are specialized for complex numerical computations, abstract interpretation focuses more on handling control flow including branches and loops.


The only work in this area that we are aware of that can also quantify round-off errors is Fluctuat, which we have already reviewed throughout this thesis.

Several abstract domains for finite-precision computations exist which are sound with respect to floating-point computations, that is, they compute bounds on the ranges of variables which can then be used to prove the absence of runtime exceptions [41, 36, 89, 117, 61]. They have been successfully used in the verification of safety-critical software [22]. One difference from our work is that we do not define join and meet operations or widening. While we do support an analysis with different paths that performs merging after conditionals, this operation is simple, as we only need to compute the convex union of intervals; loops are handled by inductive reasoning, since our focus is on the numerical errors.

Interval Constraint Solving

Duracz and Konecny [57] present a framework and a solver that can prove tight functional properties about floating-point numerical functions. The system generates verification conditions which are similar to the ones which we described in subsection 5.2.1. The verification task is reduced to an interval satisfaction problem, for which they present polynomial intervals, which are polynomial lower and upper bounds enclosing the function one wants to approximate. This allows them to automatically prove precise constraints. Makino and Berz [109] present an alternative arithmetic based on Taylor Models for solving such constraints. It has so far been successfully used for computing validated enclosures for solutions of differential equations. These techniques may be applicable for solving our constraints as well, but it remains to be seen how well they scale on roundoff error estimation problems. Furthermore, the approach from [57] requires the postcondition to be present, whereas with our forward computation we can compute the postcondition with error information automatically.

Solving interval constraints also plays a central role in symbolic execution, where path conditions over floating-points have to be solved to determine feasibility of individual paths and for finding concrete test inputs. Among the tools that are sound with respect to floating-point arithmetic is the FPSE tool [28, 15] which solves floating-point constraints using interval constraint propagation. Borges et al. [26] present a constraint solving approach which combines a meta-heuristic search with interval constraint propagation, where the interval solver is used to provide an initial estimate for the seeds of the search. Lakhotia et al. [98] combine random search and evolutionary techniques for the same purpose.

Decision Procedures

An alternative for solving floating-point constraints is dedicated decision procedures. Rümmer and Wahl [136] formalize floating-point arithmetic for the SMT-LIB format. Bit-precise constraints, however, quickly become very large. Brillout et al. [29] address this problem by using a combination of over- and under-approximations. Haller et al. [78] present an

alternative approach in combining interval constraint solving with a CDCL algorithm, and Gao et al. [67] present a decision procedure for nonlinear real arithmetic combining interval constraint solving with an SMT solver for linear arithmetic. Anta et al. [9] prove fixed-point constraints with a combination of bit vectors and reals. While these approaches can check ranges on finite-precision numerical variables, they do not handle round-off errors or other uncertainties and cannot compute specifications automatically. Furthermore, a combination of theories is problematic, and we are not aware of an approach that is able to quantify the deviation of finite-precision computations with respect to reals.

For solving real-valued constraints, several solvers exist which use interval constraint propagation together with different strategies for nonlinear computations: iSAT3 [141], dReal [68] and MetiTarski [7]. These tools and their techniques are possible alternatives to the Z3 solver that we use as a back-end for our real-valued range computation.

9.3 Quantifying Accuracy

We now turn to work that is concerned with the errors of finite-precision computations. Beyond our analysis and Fluctuat's abstract domain, different approaches have been developed for estimating round-off errors. Fang et al. [60] use affine arithmetic with a special model for floating-points to evaluate the difference between a reduced precision implementation and a normal floating-point implementation, but use probabilistic bounding to tackle over-approximations. Furthermore, this work only allows addition and multiplication. Ivancic et al. [87] use bounded model checking together with interval arithmetic to statically detect loss of accuracy in floating-point computations. While less scalable, this approach has the benefit of being able to produce counter-examples. An et al. [8] use Dekker's algorithms [54] for exact addition and multiplication to determine and track the round-off error of one computation by instrumenting the binary. The algorithms essentially rely on the non-associativity of floating-point arithmetic to compute the round-off error at each computation step.

The idea of a separation of concerns has also been successfully used in [11]. Whereas we separate the real-valued computation from the implementation, they separate reasoning about the stability of control systems in the presence of uncertainties from the implementation.

Error Estimation in Numerical Analysis

Error analysis forms an important part of numerical analysis [30, 95], perhaps best demonstrated by the fact that roundoff errors are usually discussed first, before any actual numerical algorithms. Higham gives an extensive collection of stable numerical algorithms [83]; however, as he says, "There is no simple recipe for designing numerically stable algorithms". Stability is usually proven or determined for each algorithm individually and manually, and most work has been done for linear algorithms. Furthermore, the analyses are, as far as we know, for the most part qualitative. Our work is, in contrast, quantitative in nature and more general in

the sense that we do not take specific (high-level, mathematical) properties of a problem into account and also focus on nonlinear arithmetic computations. Our analysis is thus more low-level, but (hopefully) more accurate and more automated. We have also placed an emphasis on sound reasoning, which is important in the verification of safety-critical systems, whereas error estimation in numerical analysis is often approximate itself. We believe that these approaches are complementary, and it would be very interesting to combine ideas from numerical analysis with automated reasoning.

Fixed-point Arithmetic Compilation

Finite-precision round-off errors play an important role during the compilation process from a real-valued expression to fixed-point arithmetic, which reduces to the allocation of the number of integer bits for the variable ranges and the number of fractional bits for the accuracy. To name a few (of many more), Lee et al. [101] use affine arithmetic for bit-width optimization and also provide an overview of related approaches, both static and dynamic. The approach by Mallik et al. [110] is simulation-based and optimizes bit-widths to reduce power consumption. Kinsman and Nicolici [94] employ a range refinement method based on SMT solvers, similar to ours, to determine the number of integer bits. Pang et al. [128] combine interval and affine arithmetic and an encoding of polynomials into pseudo-boolean functions. Another approach uses automatic differentiation [65] to symbolically compute the sensitivity of the outputs to the inputs. From this (unsound) approximation they derive bounds for the number of fractional bits, and they also apply this technique to determine the number of necessary mantissa bits for floating-point arithmetic. Controllers are also not unique, and Majumdar et al. [108] perform a particle swarm optimization to search for the best controller with respect to several performance criteria. The evaluation of each controller takes into account implementation errors, which are translated into a mixed integer arithmetic problem.

Testing

Testing is also a popular approach for checking the accuracy of floating-point programs. While testing cannot provide guarantees, test input or counter-example generation can be very useful when debugging an application. A common thread is to perturb the numerical computation and observe the results in order to identify computations that are numerically unstable. The CADNA library [90] does this by repeatedly running a computation and randomly choosing the rounding mode for each run. The hope is that if the results are consistent, then the computation is stable. However, the stochastic approach does not provide rigorous results, because round-off errors are not always uniformly distributed (e.g. in loops). Tang et al. [149] test numerical code for accuracy by perturbing low-order bits of values and rewriting the expressions. The idea is to exaggerate initial errors and thus make inaccuracies more visible. Probabilistic arithmetic [144] is a similar approach, but it does the perturbation by using different rounding modes.


Yet other techniques exist whose goal is to detect inputs which cause large round-off errors (instead of checking numerical stability). Chiang et al. [37] developed a guided search to find inputs which maximize errors. Benz et al. [18] propose a testing procedure to detect accuracy problems by instrumenting code to perform a higher-precision computation side by side with the regular computations. While these approaches are sound with respect to floating-point arithmetic, they only generate or check individual inputs and are thus not able to verify or compute output ranges or their round-off errors. Paganelli and Ahrendt [127] use a floating-point decision procedure to detect large differences in the result between computations of different precisions. Using a floating-point decision procedure allows one to consider whole input ranges at once, instead of single inputs. However, since combining different theories inside the solver is problematic, they can only compare two different floating-point precision implementations and thus do not get sound error bounds with respect to a real-valued semantics. Lam et al. [99] use instrumentation to detect cancellation by monitoring exponent ranges and can thus identify possible places where significant digits may be lost. Bao and Zhang [17] refine this approach by using a cancellation bit to track relative errors through a computation. This bit is set or unset depending on the differences in exponents of the result and its operands. A computation is then flagged as unstable if a set cancellation bit reaches a predicate such as a branch condition, in which case the user can choose to automatically restart the computation in higher precision. None of these techniques can take into account external numerical errors.

Theorem Proving

Automation in tools comes at the expense of accuracy. When one needs to prove very precise properties about finite-precision code, then theorem proving is a suitable avenue. Many such tools nowadays include formalizations of floating-point arithmetic, which makes the task easier but still requires substantial interaction from an expert user. For example, the Gappa tool [103, 50] generates a proof from source code with specifications which is checkable by the interactive theorem prover Coq [19]. It can reason about properties that can be reduced to reasoning about ranges and errors and internally uses interval arithmetic. Gappa is useful for proving very precise properties of specialized functions, such as software implementations of elementary functions. A similar approach is taken by [13], which generates verification conditions that are discharged by various theorem provers. Boldo and Marche [25] present a whole chain of tools, including Coq and Gappa, for proving numerical C programs correct. Boldo and Nguyen [23] also take into account hardware features like fused-multiply instructions and extended precision registers which produce different roundoff errors than a simple evaluation. (The JVM currently does not use these features.) Theorem provers are also successfully and now standardly being used to verify floating-point hardware [119, 140, 80]. Our approach makes a different compromise on the accuracy vs. automation trade-off by being less accurate, but automatic. Interactive theorem provers can be used as complements to our tool: if our tool cannot provide sufficient accuracy, interactive tools can be employed by an expert user on selected methods and the results can then be used by our tool in the context of the overall

program. Our analysis (and most others') for floating-point precisions relies on the hardware and software conformance to the IEEE 754 standard. Furthermore, we rely on the compiler to not have bugs or re-order arithmetic operations arbitrarily. CompCert is a verified C compiler, and Boldo et al. [24] present a recent extension of its correctness proof to floating-point arithmetic, which is thus able to ensure that our assumptions are valid.

9.4 Synthesis

Ideally, we want to generate numerical programs which are correct by construction, also with respect to error specifications. Specialized algorithms for computing sums [138] or dot products [126] with better accuracy in floating-point go in this direction. While extremely useful for some applications, they are also very limited, as they consider only one particular type of computation. This section reviews some other, more general, work which attempts to improve finite-precision implementations in terms of accuracy or efficiency.

Rewriting

We are not the only ones who have exploited the non-associativity of finite-precision arithmetic. CGPE [121] is a software tool that synthesizes fast and certified code for univariate and bivariate polynomials in fixed-point arithmetic, optimized for a specific target architecture. In contrast to our work, the optimization criterion is execution time and error bounds are merely used to discard final candidate evaluation schemes that do not meet a basic error bound. Martel [111] considers rewriting fixed-point arithmetic expressions. The accuracy measure is the maximum number of bits required to hold the integral part. Ioualalen and Martel [86] refine this idea by developing an abstract domain for representing an under-approximation of mathematically equivalent expressions. Similarly to our fitness computation, their computation of accuracy of each expression uses affine arithmetic. They then use a local greedy search to find expressions with a more accurate formulation in a floating-point implementation. Their search is local in the sense that subexpressions are optimized without considering the global error, and thus may exclude many possible expressions.

Eldib and Wang [58] present a sketching-like approach to reduce the bit width of a fixed-point expression by rewriting. Given a bit width target, a surrounding region is determined for each AST node that may overflow and is then used in an SMT query which uses test inputs and an AST skeleton to generate a better candidate program. Another call to the SMT solver is used to ensure the two programs are equivalent. In contrast to our work, which is general and considers the global accuracy of the expression, this work only applies to linear programs and performs a local optimization of the bit width.


Approximate Computing

Rewriting an expression only changes the execution order, but keeps the same mathematical formulation and also the same data type. For certain applications which can tolerate larger errors, these requirements may be relaxed in favor of improved efficiency. Linderman et al. [103] use affine arithmetic by considering the problem of reducing precision for performance reasons. However, the resulting system still requires interactive effort. Rubio-Gonzales et al. [135] use user-defined inputs as test vectors to determine approximate error bounds which are then used to optimize program efficiency by choosing lower floating-point precision data types for some variables. While our implementation is currently limited to a fixed precision for the whole program, the techniques work, in principle, for mixed precision as well, and could be used to certify programs returned from such optimization procedures. Langou et al. [100] demonstrate that for some iterative linear algorithms it is sufficient to use double floating-point precision for the residual and update computation only, and single precision for the rest. If the condition number of the matrix is small enough, the results obtained are the same as if double precision had been used throughout. This result is specific to iterative linear algebra computations.

Schkufza et al. [143] go even a step further by modifying the computation itself. The paper presents a guided search which explores randomly changing individual instructions of a function's binary. A minimum accuracy, defined with respect to the original program, is ensured by simulation. Zhu et al. [153] show an algorithm for optimizing approximate computations at a higher level by exploring different (user-given and user-specified) implementations, in order to minimize resource consumption while satisfying a probabilistic error specification. Baek and Chilimbi [14] pursue a similar idea but determine the quality of different implementations by simulation with a user-given quality-of-service evaluation function. Westbrook and Chaudhuri [150] propose a modular framework for quantitatively reasoning about different approximations, with different metrics for different program constructs. We view our work as complementary, as it could be used to provide the (sound) specifications of alternative implementations.

Approximations have also been proposed at the hardware level [20], where one could trade occasionally wrong results for lower power consumption. In this context, Carbin et al. [32] present a static probabilistic analysis for verifying that a program running on unreliable hardware conforms to a reliability specification.

9.5 Beyond Roundoff Errors

Discontinuity Errors

Ivancic et al. [87] combine abstract interpretation with model checking to check the stability of programs, tracking one input at a time. Majumdar et al. [106] use concolic execution to find two sets of inputs which maximize the difference in the outputs. These approaches are based on

testing, however, and cannot prove sound bounds. Majumdar and Saha [107] show programs robust by considering all possible execution paths. The paths are obtained by a symbolic execution engine, and the robustness property is proven by showing that each pair of paths does not differ by more than a specified amount. While their approach can produce witnesses for a violation of this property, this work only considers integer programs and relies entirely on a solver. Chaudhuri et al. [35] develop a framework based on proof rules for showing programs robust in the sense of k-Lipschitz continuity, and Gazeau et al. [69] relax this strict definition of robustness to programs with specified uncertainties and present a framework for proving while-loops with a particular structure robust. Neither of these approaches quantifies numerical errors arising from the arithmetic computations. In our work, we quantify the error and leave it up to the user and the application to determine whether it is sufficiently small, without considering notions such as robustness or continuity.

Shewchuk [146] presents techniques for arbitrary-precision and adaptive-precision arithmetic, aimed at geometric applications with the goal of deciding geometric predicates robustly. Geometric applications usually do not tolerate inaccuracies; for instance, to decide whether a point lies left or right of a line, many digits may be necessary. Our work is not suitable for such accurate codes; instead, we target applications where we can statically prove that a certain data type precision is sufficient and take advantage of efficient hardware during the execution.

Truncation Errors

One possible way to deal with truncation errors is to use self-validated methods, which return guaranteed enclosures of the solution. For example, [137] contains a fairly complete overview, and an implementation exists in the INTLAB library [139] for MATLAB. The main difference from our work is that these methods compute the solution itself, using interval arithmetic throughout the computation. In contrast, we use the underlying theorems as a verification method that accepts solutions computed by an arbitrary method. This allows us to combine the generally good results and efficiency of numerical methods with sound results. Moreover, our implementation performs part of the computation already at compile time, and is thus more efficient. In the case of systems of linear equations, one can use the linearity for optimizations [123]. The algorithm remains an iterative solver, though. Demmel et al. [55] give an iterative refinement algorithm for linear systems that uses higher precision arithmetic to compute the residual. The techniques cannot, however, be translated to nonlinear systems. Since our approach does not compute residuals, which suffer heavily from cancellation errors, we believe that the additional cost of higher precision arithmetic is not warranted in order to achieve a slightly better accuracy.


10 Conclusion

Some may believe that round-off errors always distort results extensively so we should not use floating-point or fixed-point arithmetic at all or, at the other extreme, that the errors are very small and can thus be ignored. More likely, reality is somewhere in between. For many applications, finite-precision arithmetic is perfectly adequate, if used carefully and with some error consideration. Furthermore, there is a fundamental trade-off between many orthogonal or conflicting interests: efficiency vs. accuracy of the application, runtime overhead vs. static analysis over-approximation in the verification approach, scalability of the analysis technique vs. counter-example generation, etc. The best technique and implementation for a given mathematical problem may lie anywhere on this spectrum and our goal was to help programmers navigate this space.

We have presented tools and techniques for automated, sound and accurate numerical error analysis and shown how these building blocks can be used for synthesis of numerical programs with explicit error specifications. On the technical side, we developed and studied in detail different approaches for sound range computation and error quantification. We used and combined more traditional techniques for sound computation (interval and affine arithmetic) with more recently developed nonlinear solvers and investigated their utility and limits. We further showed how such an error computation can be joined with other techniques (genetic programming or validated numerics) to go beyond just round-off error estimation and actively improve accuracy or quantify truncation errors. On the conceptual side, we argue that the current state of the art of programming numerical codes is too low level for many applications and that today's reasoning capabilities enable a higher-level programming model which is semantically closer to the mathematical domain. Recognizing that one approach does not fit all, we nonetheless explored different ways of integrating sound error analysis into a programming language, both statically and dynamically.

We believe that our results are encouraging and that our work is a successful step towards our goal of helping programmers write numerical codes that do what they are expected to do.


We have mostly focused on round-off errors, although our propagation algorithms are oblivious to the type of the errors. Truncation errors are an integral part of numerical computations and can often be (much) larger than round-off errors. Extending our work to additional sources of errors would require an appropriate extension of our specification language to specify the 'intended' meaning of a computation. Furthermore, we expect the automated computation of such errors to pose new challenges.

Such an intended meaning of a computation that goes beyond the real-valued vs. finite-precision difference that we considered in this thesis is only one example of more functional verification targets. Other properties of interest could be the convergence behavior of iterative algorithms or the stability of, for example, numerical integration algorithms. We believe that our separation of round-off errors from the ideal computation can help free functional verification attempts from the low-level considerations. They can thus focus on the real-valued algorithms, while our error estimates can (hopefully) validate the results also for the noisy finite-precision implementation.

We have focused on a sound error analysis, but round-off (and other) errors do not always reach the worst-case magnitude. Relaxing this requirement towards a probabilistic approach would allow for tighter (and perhaps more realistic) error estimates. It would be interesting to see if and how our techniques can be used in this context to provide confidence in the analysis’ results.

Most approximations that we have considered were inevitable; we can sometimes reduce round-off errors, but not eliminate them entirely. The inherent error tolerance of many applications allows us to go the other way, namely to do computations less accurately than what is possible and save resources. Previous work considered optimizations on a rather low level [143] or required the user to come up with candidate approximations [153, 14]. Another possibility is to use the knowledge already available in numerical mathematics and combine it with automated reasoning techniques to generate these candidates at a higher level and with accuracy guarantees.

Finally, there is a lot of potential for synthesis of numerical codes. Our rewriting focused on a rather small number of mathematical equivalences, which can be significantly extended, also to other mathematical functions like square root and trigonometry. We can also imagine applying our techniques to the selection of different algorithms, with different accuracy specifications. We see the challenge both in the huge search space as well as in the numerical error estimation. It would be interesting to investigate whether 'rules of thumb' exist which can improve accuracy in general, even if not optimally. It may also not be possible to find one expression that is optimal for all inputs; rather, it may be necessary to consider several of them. It will thus be necessary to develop techniques which determine the ranges of validity so an overall error guarantee can be maintained.

There are still more than a few stones left unturned!

Bibliography

[1] Caliper open source framework. http://code.google.com/p/caliper/.

[2] Control Tutorial for Matlab and Simulink. http://www.library.cmu.edu/ctms/ctms/.

[3] GNU Octave. http://www.gnu.org/software/octave/index.html.

[4] MathWorks MATLAB. http://www.mathworks.ch/.

[5] Wolfram Mathematica. http://www.wolfram.com/mathematica/.

[6] The Computer Language Benchmarks Game. http://benchmarksgame.alioth.debian.org/, accessed May 2014.

[7] Behzad Akbarpour and Lawrence Charles Paulson. MetiTarski: An Automatic Theorem Prover for Real-Valued Special Functions. Journal of Automated Reasoning, pages 175–205, 2010.

[8] David An, Ryan Blue, Michael Lam, Scott Piper, and Geoff Stoker. FPInst: Floating Point Error Analysis Using Dyninst. http://www.freearrow.com/downloads/files/fpinst.pdf, 2008.

[9] Adolfo Anta, Rupak Majumdar, Indranil Saha, and Paulo Tabuada. Automatic Verification of Control System Implementations. In EMSOFT, 2010.

[10] Adolfo Anta and Paulo Tabuada. To Sample or not to Sample: Self-Triggered Control for Nonlinear Systems. IEEE Trans. Automatic Control, 55(9), 2010.

[11] Adolfo Anta Martinez, Rupak Majumdar, Indranil Saha, and Paulo Tabuada. Automatic Verification of Control System Implementations. In EMSOFT, 2010.

[12] K. J. Astrom and R. M. Murray. Feedback Systems. Princeton University Press, 2008.

[13] Ali Ayad and Claude Marché. Multi-prover verification of floating-point programs. In IJCAR, 2010.

[14] Woongki Baek and Trishul M. Chilimbi. Green: a framework for supporting energy-conscious programming using controlled approximation. In PLDI, 2010.


[15] Roberto Bagnara, Matthieu Carlier, Roberta Gori, and Arnaud Gotlieb. Symbolic Path-oriented Test Data Generation for Floating-point Programs. In ICST, 2013.

[16] David H. Bailey, Yozo Hida, Xiaoye S. Li, and Brandon Thompson. C++/Fortran-90 double-double and quad-double package. http://crd-legacy.lbl.gov/~dhbailey/mpdist/, 2013.

[17] Tao Bao and Xiangyu Zhang. On-the-fly Detection of Instability Problems in Floating-point Program Execution. In OOPSLA, 2013.

[18] Florian Benz, Andreas Hildebrandt, and Sebastian Hack. A Dynamic Program Analysis to Find Floating-point Accuracy Problems. In PLDI, 2012.

[19] Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program Development. Springer, 2004.

[20] Jesse Bingham and Joe Leslie-Hurd. Verifying Relative Error Bounds Using Symbolic Simulation. In CAV, 2014.

[21] Régis Blanc, Etienne Kneuss, Viktor Kuncak, and Philippe Suter. An Overview of the Leon Verification System: Verification by Translation to Recursive Functions. In Scala Workshop, 2013.

[22] Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. A static analyzer for large safety-critical software. In PLDI, 2003.

[23] S. Boldo and T. M. T. Nguyen. Hardware-independent proofs of numerical programs. In Proceedings of the Second NASA Formal Methods Symposium, 2010.

[24] Sylvie Boldo, Jacques-Henri Jourdan, Xavier Leroy, and Guillaume Melquiond. A Formally-Verified C Compiler Supporting Floating-Point Arithmetic. In IEEE Symposium on Computer Arithmetic, pages 107–115, 2013.

[25] Sylvie Boldo and Claude Marché. Formal verification of numerical programs: from C annotated programs to mechanical proofs. Mathematics in Computer Science, 5(4), 2011.

[26] Mateus Borges, Marcelo d'Amorim, Saswat Anand, David Bushnell, and Corina S. Pasareanu. Symbolic Execution with Interval Solving and Meta-heuristic Search. In ICST, 2012.

[27] Cristina Borralleras, Salvador Lucas, Albert Oliveras, Enric Rodríguez-Carbonell, and Albert Rubio. Sat modulo linear arithmetic for solving polynomial constraints. J. Automated Reasoning, 48(1), 2012.

[28] B. Botella, A. Gotlieb, and C. Michel. Symbolic execution of floating-point computations. Softw. Test. Verif. Reliab., 16(2), 2006.


[29] A. Brillout, D. Kroening, and T. Wahl. Mixed abstractions for floating-point arithmetic. In FMCAD, 2009.

[30] Richard L. Burden and J. Douglas Faires. Numerical Analysis. Cengage Learning, 2011.

[31] Eugene Burmako. Scala Macros: Let Our Powers Combine!: On How Rich Syntax and Static Types Work with Metaprogramming. In SCALA, 2013.

[32] Michael Carbin, Sasa Misailovic, and Martin C. Rinard. Verifying Quantitative Reliability for Programs That Execute on Unreliable Hardware. In OOPSLA, 2013.

[33] Swarat Chaudhuri, Azadeh Farzan, and Zachary Kincaid. Consistency Analysis of Decision-making Programs. In POPL, 2014.

[34] Swarat Chaudhuri, Sumit Gulwani, and Roberto Lublinerman. Continuity analysis of programs. In POPL, 2010.

[35] Swarat Chaudhuri, Sumit Gulwani, Roberto Lublinerman, and Sara Navidpour. Proving Programs Robust. In ESEC/FSE, 2011.

[36] Liqian Chen, Antoine Miné, Ji Wang, and Patrick Cousot. Interval Polyhedra: An Abstract Domain to Infer Interval Linear Relationships. In SAS, 2009.

[37] Wei-Fan Chiang, Ganesh Gopalakrishnan, Zvonimir Rakamaric, and Alexey Solovyev. Efficient Search for Inputs Causing High Floating-point Errors. In PPoPP, 2014.

[38] Konstantina Christakou and Jean-Yves Le Boudec. State estimation problem for power systems. Personal communication, 2013.

[39] Hélène Collavizza, Claude Michel, Olivier Ponsini, and Michel Rueher. Generating Test Cases Inside Suspicious Intervals for Floating-point Number Programs. In CSTVA, 2014.

[40] George E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. Automata Theory and Formal Languages, 33:134–183, 1975.

[41] P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. The ASTRÉE Analyser. In ESOP, 2005.

[42] Patrick Cousot and Radhia Cousot. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In POPL, 1977.

[43] Germund Dahlquist and Åke Björck. Numerical Methods in Scientific Computing. Society for Industrial and Applied Mathematics, 2008.

[44] Eva Darulova and Viktor Kuncak. Trustworthy Numerical Computation in Scala. In OOPSLA, 2011.


[45] Eva Darulova and Viktor Kuncak. Certifying Solutions for Numerical Constraints. In RV, 2012.

[46] Eva Darulova and Viktor Kuncak. On Numerical Error Propagation with Sensitivity. Technical Report 200132, EPFL, 2014.

[47] Eva Darulova and Viktor Kuncak. Sound Compilation of Reals. In POPL, 2014.

[48] Eva Darulova, Viktor Kuncak, Rupak Majumdar, and Indranil Saha. Synthesis of Fixed-Point Programs. In EMSOFT, 2013.

[49] M. Davis. DoubleDouble.java. http://tsusiatsoftware.net/dd/main.html.

[50] Florent de Dinechin, Christoph Lauter, and Guillaume Melquiond. Certifying the Floating-Point Implementation of an Elementary Function Using Gappa. IEEE Trans. Comput., 60(2), 2011.

[51] L. H. de Figueiredo and J. Stolfi. Self-Validated Numerical Methods and Applications. IMPA/CNPq, Brazil, 1997.

[52] L. H. de Figueiredo and J. Stolfi. Affine Arithmetic: Concepts and Applications. Numerical Algorithms, 37(1-4), 2004.

[53] Leonardo De Moura and Nikolaj Bjørner. Z3: an efficient SMT solver. In TACAS, 2008.

[54] T.J. Dekker. A floating-point technique for extending the available precision. Numerische Mathematik, 18(3):224–242, 1971.

[55] James W. Demmel, Yozo Hida, William Kahan, Xiaoye S. Li, Soni Mukherjee, and E. Jason Riedy. Error Bounds from Extra Precise Iterative Refinement. Technical report, EECS Department, University of California, Berkeley, 2005.

[56] Vijay D’Silva, Leopold Haller, Daniel Kroening, and Michael Tautschnig. Numeric Bounds Analysis with Conflict-driven Learning. In TACAS, 2012.

[57] Jan Duracz and Michal Konečný. Polynomial function intervals for floating-point software verification. Annals of Mathematics and Artificial Intelligence, 70(4), 2014.

[58] Hassan Eldib and Chao Wang. An SMT based method for optimizing arithmetic computations in embedded software code. In FMCAD, 2013.

[59] A. G. Ershov and T. P. Kashevarova. Interval Mathematical Library Based on Chebyshev and Taylor Series Expansion. Reliable Computing, 11, 2005.

[60] Claire F. Fang, Rob A. Rutenbar, and Tsuhan Chen. Fast, Accurate Static Analysis for Fixed-Point Finite-Precision Effects in DSP Designs. In ICCAD, 2003.

[61] Jérôme Feret. Static Analysis of Digital Filters. In ESOP, 2004.


[62] Laurent Fousse, Guillaume Hanrot, Vincent Lefèvre, Patrick Pélissier, and Paul Zimmermann. MPFR: A Multiple-precision Binary Floating-point Library with Correct Rounding. ACM Trans. Math. Softw., 33(2), 2007.

[63] Martin Fränzle and Christian Herde. HySAT: An efficient proof engine for bounded model checking of hybrid systems. Formal Methods in System Design, 30(3), 2007.

[64] Martin Fränzle, Christian Herde, Tino Teige, Stefan Ratschan, and Tobias Schubert. Efficient Solving of Large Non-linear Arithmetic Constraint Systems with Complex Boolean Structure. Journal on Satisfiability, Boolean Modeling and Computation, 1, 2007.

[65] A. A. Gaffar, O. Mencer, and W. Luk. Unifying bit-width optimisation for fixed-point and floating-point designs. In FCCM, 2004.

[66] Sicun Gao, Jeremy Avigad, and Edmund M. Clarke. Delta-Complete Decision Procedures for Satisfiability over the Reals. In IJCAR, 2012.

[67] Sicun Gao, M. Ganai, F. Ivancic, A. Gupta, S. Sankaranarayanan, and E. M. Clarke. Integrating ICP and LRA solvers for deciding nonlinear real arithmetic problems. In FMCAD, 2010.

[68] Sicun Gao, Soonho Kong, and Edmund M. Clarke. dReal: An SMT Solver for Nonlinear Theories over the Reals. In CADE, 2013.

[69] Ivan Gazeau, Dale Miller, and Catuscia Palamidessi. A non-local method for robustness analysis of floating point programs. In QAPL, 2012.

[70] Khalil Ghorbal, Eric Goubault, and Sylvie Putot. A Logical Product Approach to Zonotope Intersection. In CAV, 2010.

[71] David Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv., 23(1), 1991.

[72] Eric Goubault and Sylvie Putot. Static Analysis of Finite Precision Computations. In VMCAI, 2011.

[73] Eric Goubault and Sylvie Putot. Robustness Analysis of Finite Precision Implementations. In APLAS, 2013.

[74] Eric Goubault, Sylvie Putot, and Franck Védrine. Modular Static Analysis with Zonotopes. In SAS, 2012.

[75] M. Green and D. J. N. Limebeer. Linear Robust Control. Prentice Hall, 1994.

[76] Andreas Griewank. A mathematical view of automatic differentiation. Acta Numerica, 12:321–398, 2003.

[77] John L. Gustafson. The End of Error: Unum Computing. CRC Press, 2015.


[78] L. Haller, A. Griggio, M. Brain, and D. Kroening. Deciding floating-point logic with systematic abstraction. In FMCAD, 2012.

[79] E. R. Hansen. A generalized interval arithmetic. In Interval Mathematics, volume 29, pages 7–18. Springer Berlin Heidelberg, 1975.

[80] John Harrison. Floating-Point Verification using Theorem Proving. In Formal Methods for Hardware Verification, pages 211–242, 2006.

[81] L. Hatton and A. Roberts. How Accurate is Scientific Software? IEEE Trans. Softw. Eng., 20(10), 1994.

[82] Yozo Hida, Xiaoye S. Li, and David H. Bailey. Library for double-double and quad-double arithmetic. Technical report, NERSC Division, Lawrence Berkeley National Laboratory, 2007.

[83] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, 2002.

[84] IEEE Computer Society. IEEE Standard for Floating-Point Arithmetic. IEEE Std 754-2008, 2008.

[85] Institute of Microelectronic Systems, Hannover, Germany. aaflib - An Affine Arithmetic C++ Library. http://aaflib.sourceforge.net/, 2010.

[86] Arnault Ioualalen and Matthieu Martel. A New Abstract Domain for the Representation of Mathematically Equivalent Expressions. In SAS, 2012.

[87] F. Ivancic, M. K. Ganai, S. Sankaranarayanan, and A. Gupta. Numerical stability analysis of floating-point computations using software model checking. In MEMOCODE, 2010.

[88] Java-API. Java Platform SE 7 API - java.lang.Math. http://docs.oracle.com/javase/7/docs/api/, 2014.

[89] Bertrand Jeannet and Antoine Miné. Apron: A Library of Numerical Abstract Domains for Static Analysis. In CAV, 2009.

[90] F. Jézéquel and J.-M. Chesneaux. CADNA: a library for estimating round-off error propagation. Computer Physics Communications, 178(12), 2008.

[91] J. Jiang, W. Luk, and D. Rueckert. FPGA-Based Computation of Free-Form Deformations. In Field-Programmable Logic and Applications, pages 1057–1061. Springer, 2003.

[92] Dejan Jovanović and Leonardo de Moura. Solving Non-linear Arithmetic. In IJCAR, 2012.

[93] W. Kahan. Miscalculating Area and Angles of a Needle-like Triangle. Technical report, University of California, Berkeley, 2000.


[94] A. B. Kinsman and N. Nicolici. Finite Precision bit-width allocation using SAT-Modulo Theory. In DATE, 2009.

[95] Jaan Kiusalaas. Numerical Methods in Engineering with MATLAB. Cambridge University Press, 2005.

[96] R. Krawczyk. Newton-Algorithmen zur Bestimmung von Nullstellen mit Fehlerschranken. Computing, 4:187–201, 1969.

[97] Volodymyr Kuznetsov, Johannes Kinder, Stefan Bucur, and George Candea. Efficient State Merging in Symbolic Execution. In PLDI, 2012.

[98] Kiran Lakhotia, Nikolai Tillmann, Mark Harman, and Jonathan de Halleux. FloPSy - Search-Based Floating Point Constraint Solving for Symbolic Execution. In ICTSS, 2010.

[99] Michael O. Lam, Jeffrey K. Hollingsworth, and G.W. Stewart. Dynamic floating-point cancellation detection. Parallel Computing, 39(3), 2013.

[100] Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, and Jack Dongarra. Exploiting the Performance of 32 Bit Floating Point Arithmetic in Obtaining 64 Bit Accuracy (Revisiting Iterative Refinement for Linear Systems). In SC, 2006.

[101] D. Lee, A. A. Gaffar, R. C. C. Cheung, O. Mencer, W. Luk, and G. A. Constantinides. Accuracy-Guaranteed Bit-Width Optimization. IEEE Trans. on CAD of Integrated Circuits and Systems, 25(10), 2006.

[102] X. Leroy. Verified Squared: Does Critical Software Deserve Verified Tools? In POPL, 2011.

[103] Michael D. Linderman, Matthew Ho, David L. Dill, Teresa H. Meng, and Garry P. Nolan. Towards program optimization through automated analysis of numerical precision. In CGO, 2010.

[104] T. Lindholm and F. Yellin. Java Virtual Machine Specification. Addison-Wesley Longman Publishing Co., Inc., 2nd edition, 1999.

[105] Sean Luke, Liviu Panait, Gabriel Balan, Sean Paus, et al. ECJ: A Java-based Evolutionary Computation Research System. http://cs.gmu.edu/~eclab/projects/ecj/.

[106] R. Majumdar, I. Saha, and Zilong Wang. Systematic Testing for Control Applications. In MEMOCODE, 2010.

[107] Rupak Majumdar and Indranil Saha. Symbolic Robustness Analysis. In RTSS, 2009.

[108] Rupak Majumdar, Indranil Saha, and Majid Zamani. Synthesis of Minimal-Error Control Software. In EMSOFT, pages 123–132, 2012.

[109] K. Makino and M. Berz. Taylor Models and Other Validated Functional Inclusion Methods. International Journal of Pure and Applied Mathematics, 4(4), 2003.


[110] A. Mallik, D. Sinha, P. Banerjee, and H. Zhou. Low-Power Optimization by Smart Bit-Width Allocation in a SystemC-Based ASIC Design Environment. IEEE Trans. on CAD of Integrated Circuits and Systems, 26(3), 2007.

[111] Matthieu Martel. Enhancing the Implementation of Mathematical Formulas for Fixed-Point and Floating-Point Arithmetics. Formal Methods in System Design, 35(3), 2009.

[112] R. Martin, H. Shou, I. Voiculescu, A. Bowyer, and G. Wang. Comparison of interval methods for plotting algebraic curves. Comput. Aided Geom. Des., 19(7), 2002.

[113] MATLAB. Embedded Code Generation. http://www.mathworks.ch/embedded-code-generation, 2014.

[114] P. McLane, L. Peppard, and K. Sundareswaran. Decentralized Feedback Controls for the Brakeless Operation of Multilocomotive Powered Trains. IEEE Trans. Autom. Control, 21(3), 1976.

[115] Jean Meeus. Astronomical Algorithms. Willmann-Bell, 1999.

[116] Kurt Mehlhorn and Stefan Schirra. Exact Computation with leda_real — Theory and Geometric Applications. In Symbolic Algebraic Methods and Verification Methods, pages 163–172. Springer Vienna, 2001.

[117] Antoine Miné. Relational Abstract Domains for the Detection of Floating-Point Run-Time Errors. In ESOP, 2004.

[118] D. Monniaux. The pitfalls of verifying floating-point computations. ACM Trans. Program. Lang. Syst., 30(3), 2008.

[119] J. S. Moore, T. W. Lynch, and M. Kaufmann. A Mechanically Checked Proof of the AMD5K86 Floating Point Division Program. IEEE Trans. Computers, 47(9), 1998.

[120] R. E. Moore. Interval Analysis. Prentice-Hall, 1966.

[121] C. Mouilleron and G. Revy. Automatic Generation of Fast and Certified Code for Polynomial Evaluation. In ARITH, 2011.

[122] James D. Murray. Mathematical Biology, I. An Introduction. Springer, 2002.

[123] Hong Diep Nguyen and Nathalie Revol. Solving and Certifying the Solution of a Linear System. Reliable Computing, 15(2), 2011.

[124] William L. Oberkampf and Christopher J. Roy. Verification and Validation in Scientific Computing. Cambridge University Press, 2010.

[125] Martin Odersky, Lex Spoon, and Bill Venners. Programming in Scala: A Comprehensive Step-by-step Guide. Artima Incorporation, 2008.


[126] T. Ogita, Siegfried M. Rump, and Shin’ichi Oishi. Accurate Sum and Dot Product. SIAM J. Sci. Comput., 26(6):1955–1988, 2005.

[127] Gabriele Paganelli and Wolfgang Ahrendt. Verifying (In-)Stability in Floating-point Programs by Increasing Precision, using SMT Solving. In SYNASC, 2013.

[128] Yu Pang, Katarzyna Radecka, and Zeljko Zilic. An Efficient Hybrid Engine to Perform Range Analysis and Allocate Integer Bit-widths for Arithmetic Circuits. In ASPDAC, 2011.

[129] Riccardo Poli, William B. Langdon, and Nicholas F. McPhee. A Field Guide to Genetic Programming. Lulu Enterprises, 2008.

[130] R. Pozo and B. R. Miller. Java SciMark 2.0. http://math.nist.gov/scimark2/about.html, 2004.

[131] D. M. Priest. Algorithms for Arbitrary Precision Floating Point Arithmetic. In Proceedings of the 10th Symposium on Computer Arithmetic, 1991.

[132] Alexander Prokopec. ScalaMeter. http://scalameter.github.io/, 2014.

[133] Alfio Quarteroni, Fausto Saleri, and Paola Gervasio. Scientific Computing with MATLAB and Octave. Springer, 3rd edition, 2010.

[134] H. H. Rosenbrock. Computer-Aided Control System Design. Academic Press, 1974.

[135] Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, and David Hough. Precimonious: Tuning Assistant for Floating-point Precision. In SC, 2013.

[136] Philipp Rümmer and Thomas Wahl. An SMT-LIB Theory of Binary Floating-Point Arithmetic. In Informal proceedings of 8th International Workshop on Satisfiability Modulo Theories (SMT) at FLoC, 2010.

[137] Siegfried M. Rump. Verification methods: rigorous results using floating-point arith- metic. Acta Numerica, 19:287–449, 2010.

[138] Siegfried M. Rump, Takeshi Ogita, and Shin’ichi Oishi. Accurate Floating-Point Summation Part I: Faithful Rounding. SIAM J. Sci. Comput., 31(1):189–224, 2008.

[139] S. M. Rump. INTLAB - INTerval LABoratory. In Developments in Reliable Computing. Kluwer Academic Publishers, 1999.

[140] D. M. Russinoff. A Mechanically Checked Proof of Correctness of the AMD K5 Floating Point Square Root Microcode. Formal Methods in System Design, 14(1), 1999.

[141] Karsten Scheibler, Alexej Disterhoft, Andreas Eggers, Felix Neubauer, Leonore Winterer, Stefan Kupferschmid, and Tobias Paxian. iSAT3. https://projects.avacs.org/projects/isat3, 2014.


[142] Karsten Scheibler, Stefan Kupferschmid, and Bernd Becker. Recent Improvements in the SMT solver iSAT. In MBMV, 2013.

[143] Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic Optimization of Floating-point Programs with Tunable Precision. In PLDI, 2014.

[144] N. S. Scott, F. Jézéquel, C. Denis, and J.-M. Chesneaux. Numerical ‘health check’ for scientific codes: the CADNA approach. Computer Physics Communications, 176(8), 2007.

[145] Charles Severance. An Interview with the Old Man of Floating-Point: Reminiscences elicited from William Kahan. http://www.cs.berkeley.edu/~wkahan/ieee754status/754story.html, 1998. A condensed version appeared in Computer, 31:114–115, 1998.

[146] Jonathan Richard Shewchuk. Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates. Discrete & Computational Geometry, 18(3), 1997.

[147] Huahao Shou, Hongwei Lin, Ralph R. Martin, and Guojin Wang. Modified affine arithmetic in tensor form for trivariate polynomial evaluation and algebraic surface plotting. Journal of Computational and Applied Mathematics, 195(1–2), 2006.

[148] Jorge Stolfi. Jorge Stolfi’s C Libraries. http://www.ic.unicamp.br/~stolfi/EXPORT/software/c/Index.html#libaa, 2007.

[149] Enyi Tang, Earl Barr, Xuandong Li, and Zhendong Su. Perturbing numerical calculations for statistical analysis of floating-point program (in)stability. In ISSTA, 2010.

[150] Edwin M. Westbrook and Swarat Chaudhuri. A Semantics for Approximate Program Transformations. CoRR, abs/1304.5531, 2013.

[151] C. Woodford and C. Phillips. Numerical Methods with Worked Examples. Springer, 2nd edition, 2012.

[152] L. Zhang, Y. Zhang, and W. Zhou. Tradeoff between Approximation Accuracy and Complexity for Range Analysis using Affine Arithmetic. Journal of Signal Processing Systems, 61(3), 2010.

[153] Zeyuan Allen Zhu, Sasa Misailovic, Jonathan A. Kelner, and Martin Rinard. Randomized accuracy-aware program transformations for efficient approximate computations. In POPL, 2012.

Eva Darulová

Education
2009–2014  PhD, École Polytechnique Fédérale de Lausanne, Switzerland. Laboratory for Automated Reasoning and Analysis (LARA), thesis advisor: Viktor Kuncak.
2005–2009  Honours Bachelor, University College Dublin, Ireland, First Class Honours. GPA: 4.2/4.2. Joint major in mathematical physics and computer science.

Experience
May–Aug 2011  Software engineering intern, Google, Zurich, Switzerland. Mainly responsible for the YouTube EDU project, including the implementation of the front-end (HTML, CSS, Python, JavaScript) and the back-end (C++, Python). I also used machine learning techniques to develop classification and filtering for educational videos.
June–July 2009  Student intern, Ericsson GmbH - Eurolab, Aachen, Germany. Support for the test automation group (TTCN-3, C++). I supported the team developing test automation code for telecommunication networks.
July–Aug 2008  Student intern, Ericsson GmbH - Eurolab, Aachen, Germany. Design and development of tools for project management (Visual Basic for MS Excel and Word). I worked closely with project managers and developed applications in MS Excel and MS Word that automated common analysis tasks. This included defining requirements based on discussions, developing a suitable interface for non-experts, implementation, and testing.
June–Aug 2007  Student intern, Openjaw Technologies, Dublin, Ireland. Software development (HTML, Java, JavaScript) and testing.

Publications

Peer-reviewed Conference Papers
E. Darulova, V. Kuncak. Sound Compilation of Reals. In POPL, 2014.
E. Darulova, V. Kuncak, I. Saha, R. Majumdar. Synthesis of Fixed-Point Programs. In EMSOFT, 2013.
E. Darulova, V. Kuncak. Certifying Solutions for Numerical Constraints. In RV, 2012.
E. Darulova, V. Kuncak. Trustworthy Numerical Computation in Scala. In OOPSLA, 2011.

Technical Reports
E. Darulova, V. Kuncak. On Numerical Error Propagation with Sensitivity. EPFL TR 200132, 2014.

Posters
E. Darulova. Programming with Uncertainties. In FMCAD Student Forum, 2013.

Invited Presentations
June ’14  Approximate Computing Workshop, Edinburgh, Scotland
May ’14  Alpine Verification Meeting, Frejus, France
Oct ’13  Dagstuhl Seminar 13411, Germany
Jan ’13  Workshop on Synthesis, Verification and Analysis of Rich Models, Rome, Italy
Mar ’12  Geneva Observatory, Switzerland

Peer reviews
PPDP 2014 (subreviewer), FMCAD 2014 (subreviewer), DATE 2014 (external reviewer), PLDI 2012 (additional reviewer), VMCAI 2011 (external reviewer), SAS 2011 (external reviewer), FMICS 2010 (external reviewer)

Teaching experience and Mentoring
TA for Programming Principles, autumn ’13. Fundamental concepts of functional programming. I led exercise sessions at EPFL and helped design exercises. This course was simultaneously offered on Coursera.
Google Summer of Code mentor, summer ’13.
Semester project supervision, spring ’12.
TA for Synthesis, Analysis and Verification, spring ’11, ’12 and ’13. Basic concepts in software verification, including topics such as Hoare logic, abstract interpretation, model checking and SMT solving. I was responsible for exercises, created exams and occasionally gave a lecture.
TA for Compiler Construction, autumn ’11 and ’12. Compiler basics such as lexical analysis, parsing, type checking, code generation and program analysis. I was responsible for creating and correcting exercises and exams.
TA for Introduction to Object-Oriented Programming, autumn ’10. Introductory programming course for first-year computer science students. I answered questions during exercise sessions and helped create and correct the final exam.

Awards and Honors
2014  Young Researcher participant at the 2nd Heidelberg Laureate Forum
2010  Google EMEA Anita Borg finalist
2009  Irish Undergraduate Awards gold medal winner in the area of computer science
2009  IBM Ireland Open Source competition winner
2008  Fr Ciaran Ryan Prize (best average grade in 5 mathematical physics subjects)
2008  Invitation to IBM EMEA Best Student Recognition Event
2008  BSc Science (Mathematical Physics) and BSc Science (Computer Science) Stage 4 Scholarship
2007  Keating Prize (best in class in mathematical physics)
2007  BSc Science (Mathematical Physics) and BSc Science (Computer Science) Stage 3 Scholarship
2006  BSc Science (Mathematics) and BSc Science (Computer Science) Stage 2 Scholarship

Professional Training
2014  Introduction to University Teaching. Three-day workshop presenting the main theoretical concepts and practical approaches to teaching and learning in higher education.

Software
Open-source software from my PhD project, as well as other contributions, is available on GitHub: https://github.com/malyzajko

Technical knowledge
Scala, Java, Python, C++, JavaScript, Linux, OS X, HTML/CSS

Languages
Slovak, German (native), English (C2), Czech (C1), French (B2) (European scale)