UCAM-CL-TR-730 Technical Report ISSN 1476-2986

Number 730

Computer Laboratory

Adaptive evaluation of non-strict programs

Robert J. Ennals

August 2008

15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
phone +44 1223 763500
http://www.cl.cam.ac.uk/

© 2008 Robert J. Ennals

This technical report is based on a dissertation submitted June 2004 by the author for the degree of Doctor of Philosophy to the University of Cambridge, King's College.

Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet:

http://www.cl.cam.ac.uk/techreports/

ISSN 1476-2986

Abstract

Most popular programming languages are strict. In a strict language, the binding of a variable to an expression coincides with the evaluation of the expression. Non-strict languages attempt to make life easier for programmers by decoupling expression binding and expression evaluation. In a non-strict language, a variable can be bound to an unevaluated expression, and such expressions can be passed around just like values in a strict language. This separation allows the programmer to declare a variable at the point that makes most logical sense, rather than at the point at which its value is known to be needed.

Non-strict languages are usually evaluated using a technique called Lazy Evaluation. Lazy Evaluation only evaluates an expression when its value is known to be needed. While Lazy Evaluation minimises the total number of expressions evaluated, it imposes a considerable bookkeeping overhead, and has unpredictable space behaviour.

In this thesis, we present a new evaluation strategy which we call Optimistic Evaluation. Optimistic Evaluation blends lazy and eager evaluation under the guidance of an online profiler. The online profiler observes the running program and decides which expressions should be evaluated lazily, and which should be evaluated eagerly. We show that the worst case performance of Optimistic Evaluation relative to Lazy Evaluation can be bounded with an upper bound chosen by the user. Increasing this upper bound allows the profiler to take greater risks and potentially achieve better average performance.

This thesis describes both the theory and practice of Optimistic Evaluation. We start by giving an overview of Optimistic Evaluation. We go on to present a formal model, which we use to justify our design. We then detail how we have implemented Optimistic Evaluation as part of an industrial-strength compiler. Finally, we provide experimental results to back up our claims.

Acknowledgments

First of all, I would like to thank my supervisor, Simon Peyton Jones, who has provided invaluable support, inspiration, and advice. Enormous thanks must go to Alan Mycroft, who has acted as my lab supervisor. I am also extremely grateful for the support of Microsoft Research, who have provided me not only with funding and equipment, but also with a supervisor, and a building full of interesting people.

I would also like to thank Jan-Willem Maessen and Dave Scott, who provided suggestions on this thesis, and Manuel Chakravarty, Karl-Filip Faxén, Fergus Henderson, Neil Johnson, Alan Lawrence, Simon Marlow, Greg Morrisett, Nick Nethercote, Henrik Nilsson, Andy Pitts, Norman Ramsey, John Reppy, Richard Sharp, and Jeremy Singer, who have provided useful suggestions during the course of this work.

This work builds on top of years of work that many people have put into the Glasgow Haskell Compiler. I would like to thank all the people at Glasgow, at Microsoft, and elsewhere, who have worked on GHC over the years. Special thanks must go to Simon Marlow, who was always able to help me when I ran into deep dark corners in the GHC runtime system.

Contents

1 Introduction 9
1.1 Lazy Evaluation 9
1.2 Mixing Eager and Lazy Evaluation 10
1.3 Optimistic Evaluation 12
1.4 Structure of this Thesis 13
1.5 Contributions 13

I Overview 14

2 Optimistic Evaluation 15
2.1 Switchable Let Expressions 16
2.2 Abortion 17
2.3 Chunky Evaluation 17
2.4 Probability of a Nested Speculation being Used 19
2.5 Exceptions and Errors 20
2.6 Unsafe Input/Output 20

3 Online Profiling 22
3.1 The Need for Profiling 22
3.2 Goodness 23
3.3 Calculating Wasted Work 24
3.4 Burst Profiling 28

II Theory 30

4 Semantics 31
4.1 A Simple Language 32
4.2 Operational Semantics 33
4.3 Denotational Semantics 40
4.4 Soundness 46

5 A Cost Model for Non-Strict Evaluation 48
5.1 Why We Need a Cost Model 49
5.2 Cost Graphs 51
5.3 Producing a Cost View for a Program 60
5.4 Rules for Evaluation 62
5.5 An Operational Semantics 66
5.6 Denotational Meanings of Operational States 71

6 Deriving an Online Profiler 75
6.1 Categorising Work 76
6.2 Blame 80
6.3 Bounding Worst Case Performance 86
6.4 Burst Profiling 89

III Implementation 95

7 The GHC Execution Model 96
7.1 The Heap 97
7.2 The Stack 100
7.3 Evaluating Expressions 103
7.4 Implementing the Evaluation Rules 105
7.5 Other Details 112

8 Switchable Let Expressions 113
8.1 Speculative Evaluation of a Let 114
8.2 Flat Speculation 119
8.3 Semi-Tagging 122
8.4 Problems with Lazy Blackholing 125

9 Online Profiling 128
9.1 Runtime State for Profiling 128
9.2 Implementing the Evaluation Rules 132
9.3 Heap Profiling 137
9.4 Further Details 139

10 Abortion 143
10.1 When to Abort 143
10.2 How to Abort 145

11 Debugging 149
11.1 How the Dark Side Do It 150
11.2 Failing to Debug Lazy Programs 151
11.3 Eliminating Tail Call Elimination 151
11.4 Optimistic Evaluation 152
11.5 HsDebug 153

IV Conclusions 154

12 Results 155
12.1 How the Tests were Carried Out 156
12.2 Performance 157
12.3 Profiling 159
12.4 Metrics 167
12.5 Semi-Tagging 172
12.6 Heap Usage 173
12.7 Miscellanea 176

13 Related Work 181
13.1 Static Hybrid Strategies 182
13.2 Eager Haskell 187
13.3 Speculative Evaluation for Multiprocessor Parallelism 189
13.4 Speculation for Uniprocessor Parallelism 191
13.5 Models of Cost 193
13.6 Profiling 196
13.7 Debuggers for Non-Strict Languages 197
13.8 Reducing Space Usage 199
13.9 Other Approaches to Evaluation 200

14 Conclusions 202

A Proof that Meaning is Preserved 205
A.1 Evaluation Preserves Meaning 205
A.2 Abortion Preserves Meaning 208

B Proof that Costed Meaning is Preserved 211
B.1 Lemmas 211
B.2 Proof of Soundness for Evaluation Rules 212
B.3 Proof of Soundness for Abortion Rules 214

C Proof that programWork predicts workDone 216
C.1 Preliminaries 216
C.2 Proof for Evaluation Transitions 217
C.3 Proof for Abortion Transitions 220

D Example HsDebug Debugging Logs 221

Chapter 1

Introduction

Avoiding work is sometimes a waste of time.

Non-strict languages are elegant, but they are also slow. Optimistic Evaluation is an implementation technique that makes them faster. It does this by using a mixture of Lazy Evaluation and Eager Evaluation, with the exact blend decided by an online profiler.

1.1 Lazy Evaluation

Lazy Evaluation is like leaving the washing-up until later. If you are never going to use your plates again then you can save considerable time by not washing them up. In the extreme case you may even avoid washing up a plate that is so dirty that it would have taken an infinite amount of time to get clean. However, if you are going to need to use your plates again, then leaving the washing up until later will waste time. By the time you eventually need to use your plates, the food on them will have stuck fast to the plate and will take longer to wash off. You will also have to make space in your kitchen to hold all the washing up that you have not yet done.

Non-strict languages aim to make life easier for programmers by removing the need for a programmer to decide if and when an expression should be evaluated. If a programmer were to write the following:

let x = E in E'

then a strict language such as C [KR88] or ML [MTHM97] would evaluate the let expression eagerly. It would evaluate E to a value, bind x to this value, and then evaluate E'. By contrast, a non-strict language such as Haskell [PHA+99] or Clean [BvEvLP87, NSvEP91] will not require that E be evaluated until x is known to be needed. Non-strict languages are usually evaluated using an evaluation strategy called Lazy Evaluation.1 If the evaluator evaluates the let lazily then it will bind x to a description of how to evaluate E, only evaluating E if x is found to be needed.

In a strict language, the let declaration does not just say what x means; it also says that x should be evaluated now. This forces the programmer to declare x at a point at which it is known that E is needed; otherwise the work done to evaluate E might be wasted. However, the best point for x to be evaluated might not be the most logical place for x to be declared. Programmers are thus forced to compromise between elegance and efficiency in their placement of declarations.

Non-strict languages avoid this problem by distinguishing between the point at which a variable is declared and the point at which it is evaluated. It is argued that this gives programmers more freedom, and allows them to write more elegant programs than they would in a strict language [Hug89]. Unfortunately, this beauty comes at a cost. If the evaluator is not going to evaluate an expression immediately, then it must create a data structure describing how to produce the value later. Producing this data structure takes time and space. As a result, non-strict languages are usually significantly slower than strict languages.
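The behaviour of a lazily evaluated let can be seen directly in Haskell. The following sketch is ours, not taken from the thesis: the bound expression deliberately calls error, so the program only fails if the binding is actually forced.

```haskell
-- A small sketch (not from the thesis) of non-strict binding in Haskell.
-- 'x' is bound to an unevaluated thunk; the call to 'error' only fires
-- if the thunk is actually forced.
chooseLazily :: Bool -> Int
chooseLazily needed =
  let x = error "x was forced!"   -- bound, but not yet evaluated
  in if needed then x else 42

-- Under Lazy Evaluation, chooseLazily False evaluates to 42: binding x
-- costs only the construction of a thunk, and the error is never raised.
-- A strict language would evaluate x at the binding and fail immediately.
```

The thunk built for x is exactly the "data structure describing how to produce the value later" mentioned above; its construction is the overhead that Optimistic Evaluation seeks to avoid.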

1.2 Mixing Eager and Lazy Evaluation

Rather than leaving all the washing up until later, or washing up everything now, it might be best to wash up some plates now, but leave some other plates for later. I might decide to wash up the plates that I know I am going to need tomorrow. Alternatively, I might wash up only those plates which I know are fairly clean and so won’t take very long to wash up.

Unfortunately, it can be difficult to know for certain that a plate will definitely be needed or that it will definitely be easy to clean. I may have used my favourite plate every day this year, but if I got hit by a bus tomorrow then I would never need to use it again. Similarly, although my plate has always been easy to wash up in the past, I can’t be sure that it won’t be impossible to wash up this time.

1 The only non-lazy implementation of a non-strict language of which we are aware is Eager Haskell, which we discuss in Section 13.2.

Figure 1.1: Percentage of lazy let evaluations left behind by GHC’s strictness analyser that we believe should be evaluated eagerly

One way for an implementation to avoid the cost of Lazy Evaluation is simply not to use it. If E is very cheap to evaluate, or is always needed, then it is safe to evaluate x at the point at which it is declared. We can thus arrange to use Eager Evaluation in such cases, and save Lazy Evaluation for the cases where it is necessary.

It is all very well to say "we will only use Lazy Evaluation when it is necessary", but how does a language implementation find out where Lazy Evaluation can be safely replaced by Eager Evaluation? Most existing compilers for non-strict languages use a static analysis called strictness analysis (Section 13.1.2) to find expressions that are guaranteed to be needed.2 Another approach is cheapness analysis (Section 13.1.3), which finds expressions that are guaranteed to be cheap to evaluate. While both these analyses work well they, like all undecidable static analyses,
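The two safe cases can be illustrated with a Haskell sketch of our own (the function names are ours, not the thesis's): one binding is demanded on every execution path, the other is a known-cheap computation.

```haskell
-- Sketch (not from the thesis) of the two situations in which a let
-- may safely be evaluated eagerly.

-- 'alwaysNeeded': x is demanded in both branches, so a strictness
-- analyser can conclude it is always needed and evaluate it eagerly
-- without risking wasted work.
alwaysNeeded :: Int -> Int
alwaysNeeded n =
  let x = n * n          -- needed on every path
  in if n > 0 then x + 1 else x - 1

-- 'cheapBinding': the bound expression is a single arithmetic
-- operation, so a cheapness analyser can evaluate it eagerly; even if
-- the value goes unused, almost no work is wasted, and building a
-- thunk would cost more than the evaluation itself.
cheapBinding :: Int -> Int -> Int
cheapBinding a b =
  let y = a + 1          -- cheap to compute
  in if b > 0 then y else b
```

In both functions the result is the same under eager and lazy evaluation; the analyses differ only in the guarantee they use to justify evaluating the binding early.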