Specialising Dynamic Techniques for Implementing the Ruby Programming Language
Total Page:16
File Type:pdf, Size:1020Kb
SPECIALISING DYNAMIC TECHNIQUES FOR IMPLEMENTING THE RUBY PROGRAMMING LANGUAGE A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences 2015 By Chris Seaton School of Computer Science This published copy of the thesis contains a couple of minor typographical corrections from the version deposited in the University of Manchester Library. [email protected] chrisseaton.com/phd 2 Contents List of Listings7 List of Tables9 List of Figures 11 Abstract 15 Declaration 17 Copyright 19 Acknowledgements 21 1 Introduction 23 1.1 Dynamic Programming Languages.................. 23 1.2 Idiomatic Ruby............................ 25 1.3 Research Questions.......................... 27 1.4 Implementation Work......................... 27 1.5 Contributions............................. 28 1.6 Publications.............................. 29 1.7 Thesis Structure............................ 31 2 Characteristics of Dynamic Languages 35 2.1 Ruby.................................. 35 2.2 Ruby on Rails............................. 36 2.3 Case Study: Idiomatic Ruby..................... 37 2.4 Summary............................... 49 3 3 Implementation of Dynamic Languages 51 3.1 Foundational Techniques....................... 51 3.2 Applied Techniques.......................... 59 3.3 Implementations of Ruby....................... 65 3.4 Parallelism and Concurrency..................... 72 3.5 Summary............................... 73 4 Evaluation Methodology 75 4.1 Evaluation Philosophy and Goals.................. 75 4.2 Benchmarking Language Implementations............. 80 4.3 Statistical Techniques......................... 84 4.4 Completeness............................. 90 4.5 Benchmarks Used........................... 94 4.6 Benchmark Harnesses......................... 95 4.7 Repeatability of Results....................... 96 4.8 Summary............................... 99 5 Optimising Metaprogramming in Dynamic Languages 101 5.1 Introduction.............................. 101 5.2 Metaprogramming.......................... 102 5.3 Existing Implementations...................... 102 5.4 Dispatch Chains............................ 103 5.5 Implementation............................ 106 5.6 Application In Ruby......................... 107 5.7 Evaluation............................... 108 5.8 Summary............................... 110 6 Optimising Debugging of Dynamic Languages 115 6.1 Introduction.............................. 115 6.2 Ruby Debuggers............................ 117 6.3 A Prototype Debugger........................ 120 6.4 Debug Nodes............................. 121 6.5 Implementation............................ 128 6.6 Evaluation............................... 134 6.7 Related Work............................. 141 6.8 Summary............................... 142 4 7 Safepoints in Dynamic Language Implementation 145 7.1 Introduction.............................. 145 7.2 Safepoints............................... 148 7.3 Guest-Language Safepoints...................... 150 7.4 Applications of Safepoints in Ruby................. 152 7.5 Implementation............................ 155 7.6 Evaluation............................... 160 7.7 Summary............................... 167 8 Interpretation of Native Extensions for Dynamic Languages 169 8.1 Introduction.............................. 169 8.2 Existing Solutions........................... 171 8.3 TruffleC................................ 173 8.4 Language Interoperability on Top of Truffle............ 173 8.5 Implementation of the Ruby C API................. 180 8.6 Expressing Pointers to Managed Objects.............. 183 8.7 Memory Operations on Managed Objects.............. 184 8.8 Limitations.............................. 185 8.9 Cross-Language Optimisations.................... 185 8.10 Evaluation............................... 186 8.11 Interfacing To Native Libraries................... 191 8.12 Summary............................... 191 9 Conclusions 195 A Dataflow and Transactions for Dynamic Parallelism 197 A.1 Introduction.............................. 197 A.2 Irregular Parallelism......................... 198 A.3 Dataflow................................ 198 A.4 Transactions.............................. 199 A.5 Lee's Algorithm............................ 200 A.6 Implementation............................ 203 A.7 Analysis................................ 207 A.8 Further Work............................. 211 A.9 Summary............................... 212 5 B Benchmarks 213 B.1 Synthetic Benchmarks........................ 213 B.2 Production Benchmarks....................... 214 Bibliography 217 Word Count: 53442 6 List of Listings 2.1 Active Support's adding the sum method to the existing Enumerable class.................................. 38 2.2 Active Support's overwriting the <=> method to the existing DateTime class.................................. 38 2.3 Active Support's IncludeWithRange using a metaprogramming send 39 2.4 include? from Listing 2.3 rewritten to use conventional calls... 40 2.5 Active Support using introspection on an object before attempting a call which may fail......................... 40 2.6 Chunky PNG routine using send to choose a method based on input 41 2.7 PSD.rb forwarding methods using metaprogramming....... 42 2.8 Active Support's Duration forwarding method calls to Time ... 43 2.9 Active Support's delegate method dynamically creating a method to avoid metaprogramming (simplified)............... 44 2.10 Active Record's define_method_attribute method dynamically creating a getter for a database column (simplified)........ 44 2.11 PSD.rb's clamp in pure Ruby.................... 45 2.12 PSD Native's clamp in C using the Ruby extension API..... 45 2.13 A benchmark indicative of patterns found in PSD.rb....... 50 2.14 A sub-view of machine code resulting from Listing 2.13...... 50 5.1 JRuby's implementation of send (simplified)............. 103 5.2 Rubinius's implementation of send.................. 103 5.3 JRuby+Truffle’s implementation of dispatch chain nodes (simpli- fied)................................... 112 5.4 JRuby+Truffle’s implementation of a conventional method call (simplified)............................... 113 5.5 JRuby+Truffle’s implementation of send (simplified)........ 113 6.1 An example use of Ruby's set trace func method........ 118 7 6.2 stdlib-debug using set trace func to create a debugger (simplified)119 6.3 Example Ruby code.......................... 121 6.4 Example command to install a line breakpoint........... 126 6.5 Implementation of CyclicAssumption (simplified)......... 133 7.1 An example use of Ruby's ObjectSpace.each_object method.. 146 7.2 Sketch of an API for safepoints................... 151 7.3 Example locations for a call to poll() ............... 151 7.4 Using guest-language safepoints to implement each_object ... 153 7.5 Using guest-language safepoints to implement Thread.kill ... 153 7.6 Safepoint action to print stack backtraces.............. 154 7.7 Safepoint action to enter a debugger................ 155 7.8 Simple implementation of guest-language safepoints using a volatile flag................................... 156 7.9 Implementation of guest-language safepoints with an Assumption. 157 7.10 Example code for detailed analysis of the generated machine code. 162 7.11 Generated machine code for the api, removed and switchpoint safe- point configurations.......................... 163 7.12 Generated machine code for the volatile safepoint configuration.. 164 8.1 Function to store a value into an array as part of the Ruby API.. 170 8.2 Excerpt of the ruby.h implementation................ 181 8.3 Calling rb ary store from C..................... 182 8.4 IRB session using TruffleC inline to call a library routine...... 191 A.1 Parallel Lee's algorithm using a global SyncVar .......... 204 A.2 Parallel Lee's algorithm using transactional memory........ 205 A.3 Parallel Lee's algorithm using DFScala............... 206 8 List of Tables 6.1 Overhead of set trace func .................... 136 6.2 Overhead of setting a breakpoint on a line never taken (lower is better)................................. 138 6.3 Overhead of setting breakpoint with a constant condition (lower is better)................................. 139 6.4 Overhead of setting breakpoint with a simple condition (lower is better)................................. 140 6.5 Summary of overheads (lower is better)............... 140 A.1 Mean algorithm running time (seconds)............... 208 A.2 Whole program running time on 8 hardware threads (seconds).. 209 A.3 Code metrics............................. 209 9 10 List of Figures 1.1 Recommended minimum chapter reading order........... 33 2.1 A sub-view ofIR resulting from Listing 2.13............ 47 2.2 Performance of the Acid Test benchmark in existing implementa- tions of Ruby............................. 48 2.3 Performance of the Acid Test benchmark in JRuby+Truffle.... 48 3.1 Self-optimising ASTs in Truffle [114]................ 61 3.2 Graal and Truffle deoptimising from optimised machine code back to the AST, and re-optimising [114]................. 63 3.3 Summary performance of different Ruby implementations on all of our benchmarks............................ 69 3.4 Summary performance of different Ruby implementations on all of our benchmarks, excluding JRuby+Truffle............. 69 3.5 Summary performance of historical versions of MRI on all of our benchmarks.............................. 70 3.6 Summary performance of JRuby with invokedynamic ....... 70 4.1 Example