Profiling Optimised Haskell
Total Page:16
File Type:pdf, Size:1020Kb
Profiling Optimised Haskell Causal Analysis and Implementation Submitted in accordance with the requirements for the degree of Doctor of Philosophy to the University of Leeds, School of Computing June 2014 Author: Supervisor: Advisors: Peter Moritz Prof. David Duke Dr. Simon Marlow Wortmann Dr. Satnam Singh Prof. S. Peyton-Jones The candidate confirms that the work submitted is his/her own, except where work which has formed part of jointly authored publications has been included. The contri- bution of the candidate and the other authors to this work has been explicitly indicated below. The candidate confirms that appropriate credit has been given within the thesis where reference has been made to the work of others. This thesis is based on the same material as the paper “Causality of optimized Haskell: what is burning our cycles?”, by the thesis author and David Duke, published in the proceedings of the 2013 ACM SIGPLAN symposium on Haskell. Relevant material is reproduced in Chapter 3, Chapter 4 and Chapter 5, and is outside of contributions in the form of supervision and discussion entirely the author’s original work. Some parts of Chapter 5 overlap with the master thesis “Stack traces in Haskell” by Arash Rouhani from the Chalmers University of Technology. This is due to our joint work on improving the stack tracing capabilities of Haskell, which is however not integral to the contributions of this thesis. This copy has been supplied on the understanding that it is copyrighted material and that no quotation from the thesis may be published without proper acknowledgement. The right of Peter Moritz Wortmann to be identified as author of this work has been asserted by him in accordance with the Copyright, Design and Patents Act of 1988. © 2014 The University of Leeds and Peter Moritz Wortmann Acknowledgements It goes without saying that this work would not have been possible without all the people around me putting rather unwise amounts of effort into encouraging me to do crazy things. First off I have to thank David Duke and Microsoft Research for getting the ball rolling by arranging for pretty much everything I could have hoped for going into this PhD. Second I would especially like to thank my supervisor David Duke again for contributing quite a bit to making my academic life enjoyable, as well as playing an integral parts in the few productive bits of it. I always feel like he ends up reading my texts more often and more thoroughly than I ever could, which given his other commitments is an incredible feat. Furthermore, I have to credit my advisers from (formerly) Microsoft Research for always trying to push me into the right direction. And even though we probably did not manage to advance much on the project’s original goal, their feedback was always much appreciated. I have to especially thank Simon Marlow for arranging a number of meetings over the years that ended up shaping this work significantly. Finally I have to thank all the busy people that found time to read through my thesis, and discovered just how many orthographic, grammatical and semantic errors a person can hide in a document of this size. In no particular order I would like to credit Richard Senington, Dragan Šunjka, José Calderón as well as my mother Dorothee Wortmann and brother Jonas Wortmann for their tireless efforts trying to digest my heavy prose. Notes For brevity, we will use the female form throughout this thesis whenever there are multiple conceivable ways of address. This is completely arbitrary (I flipped a coin) and in no way a reflection on the role of either gender in modern society. No actual trees were harmed in the production of this PhD. Well, except for all that printing paper. My apologies. i Abstract At the present time, performance optimisation of real-life Haskell programs is a bit of a “black art”. Programmers that can do so reliably are highly esteemed, doubly so if they manage to do it without sacrificing the character of the language by falling back to an “imperative style”. The reason is that while programming at a high-level does not need to result in slow performance, it must rely on a delicate mix of optimisations and transformations to work out just right. Predicting how all these cogs will turn is hard enough – but where something goes wrong, the various transformations will have mangled the program to the point where even finding the crucial locations in the code can become a game of cat-and-mouse. In this work we will lift the veil on the performance of heavily transformed Haskell programs: Using a formal causality analysis we will track source code links from square one, and maintain the connection all the way to the final costs generated by the program. This will allow us to implement a profiling solution that can measure performance at high accuracy while explaining in detail how we got to the point in question. Furthermore, we will directly support the performance analysis process by developing an interactive profiling user interface that allows rapid theory forming and evaluation, as well as deep analysis where required. iii Contents 1 Introduction1 1.1 Problem statement..............................2 1.2 Structure...................................2 2 Background5 2.1 The Task...................................5 2.1.1 Reasoning...............................6 2.1.2 Tool Support.............................7 2.2 Verbs and Nouns...............................8 2.2.1 Verbs.................................9 2.2.2 Nouns................................. 10 2.2.3 Explanations............................. 11 2.2.4 Metrics................................ 12 2.3 Causality................................... 13 2.3.1 Context................................ 14 2.3.2 Application to Programs....................... 15 2.3.3 Alternate Worlds........................... 16 2.3.4 Minimal Change........................... 17 2.3.5 Transitivity.............................. 18 2.4 Conclusion.................................. 19 3 Haskell 21 3.1 The Language................................. 22 3.1.1 Purity................................. 22 3.1.2 Higher Order Programming..................... 23 3.1.3 Optimisation............................. 24 3.2 Objectives................................... 25 3.3 GHC Overview................................ 26 3.3.1 Core.................................. 27 3.3.2 Types................................. 28 v 3.3.3 Cmm................................. 29 3.4 Transformations Example.......................... 30 3.4.1 Rules................................. 30 3.4.2 Basic Floating............................ 31 3.4.3 Worker/Wrapper Transformation.................. 33 3.4.4 Unfolding............................... 34 3.4.5 Status Report............................. 35 3.4.6 Arity Analysis............................ 36 3.4.7 Observations............................. 39 3.4.8 Case-Of-Case............................. 40 3.5 Performance Model.............................. 41 3.5.1 Core Preparation........................... 42 3.5.2 Abstract Evaluation......................... 43 3.5.3 Registers and Stack......................... 46 3.5.4 Heap.................................. 47 3.5.5 Constructors............................. 47 3.5.6 Lambdas............................... 49 3.5.7 Applications............................. 50 3.5.8 Lets & Thunks............................ 51 3.5.9 Variables............................... 52 3.5.10 Case.................................. 53 3.5.11 Let-No-Escape............................ 54 3.5.12 Conclusion.............................. 55 4 Causality Analysis 57 4.1 Introduction.................................. 58 4.2 Events..................................... 59 4.2.1 Event Causes............................. 60 4.2.2 Cause Annotations.......................... 62 4.2.3 Annotated Judgements....................... 63 4.3 Deriving Annotations............................ 64 4.3.1 Variables............................... 65 4.3.2 Local Miracles............................ 66 4.3.3 Nested Annotations......................... 67 4.3.4 Nested Events............................ 68 4.3.5 Variable Rule............................. 70 4.4 Heap...................................... 72 4.4.1 Laziness................................ 72 4.4.2 Set-Up................................. 73 4.4.3 Proof Part 1............................. 75 4.4.4 Proof Part 2............................. 77 4.4.5 Wrapping Up............................. 79 4.5 Interrupted Rules............................... 80 4.5.1 Miraculous Interruption....................... 82 4.5.2 Application Rule........................... 83 4.6 Closest World Choice............................. 85 4.6.1 Floating Effects............................ 86 4.6.2 Floating Annotations........................ 88 4.6.3 Case Expressions........................... 89 4.6.4 One-Branch Case........................... 90 4.6.5 Skipping Scrutinisation....................... 91 4.6.6 Crash Recovery Consistency..................... 92 4.7 Causality Model............................... 94 4.7.1 Annotation Encapsulation...................... 95 4.7.2 Close Causes............................. 96 4.7.3 Close Effects............................. 97 4.7.4 Complexity.............................. 98 4.7.5 Intuition................................ 99 4.8 Optimisations................................. 100 4.8.1 Beta Reduction............................ 101 4.8.2 Push Annotations.......................... 102 4.8.3 Effects on Global Profile......................