Common Language Infrastructure for Research (Clir): Editing and Optimizing .Net Assemblies
Total Page:16
File Type:pdf, Size:1020Kb
COMMON LANGUAGE INFRASTRUCTURE FOR RESEARCH (CLIR): EDITING AND OPTIMIZING .NET ASSEMBLIES A Thesis by Shawn H. Windle Submitted to the Graduate School Appalachian State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE December 2012 Major Department: Computer Science COMMON LANGUAGE INFRASTRUCTURE FOR RESEARCH (CLIR): EDITING AND OPTIMIZING .NET ASSEMBLIES A Thesis by Shawn H. Windle December 2012 APPROVED BY: Dr. James B. Fenwick Jr. Chairperson, Thesis Committee Dr. Cindy Norris Member, Thesis Committee Dr. James T. Wilkes Member, Thesis Committee Dr. James T. Wilkes Chairperson, Computer Science Edelma D. Huntley Dean, Research and Graduate Studies Copyright°c Shawn H. Windle 2012 All Rights Reserved ABSTRACT COMMON LANGUAGE INFRASTRUCTURE FOR RESEARCH (CLIR): EDITING AND OPTIMIZING .NET ASSEMBLIES. (December 2012) Shawn H. Windle, Appalachian State University Thesis Chairperson: Dr. James B. Fenwick Jr. In 2002, Microsoft released the .NET Framework as it’s implementation of the Common Language Infrastructure (CLI). The subsequent release of Mono, an open-source implemen- tation of the CLI, allowed the .NET Framework audience to also include Max OS X, Linux and Unix users. These tools enabled high-level .NET development, but low-level researchers such as code optimizers have few tools available to manipulate .NET assembly files. This thesis presents the Common Language Infrastructure for Research (CLIR) comprised of three components: the Common Language Engineering Library (CLEL), the Common Language Optimizing Framework (CLOT), and a suite of utility applications. CLIR enables researchers to experiment with and develop tools for the .NET Framework languages. CLEL provides the means to read, edit and write low-level .NET assemblies. CLOT uses CLEL to provide a framework for code optimization including algorithms and data structures for three traditional optimizations. Evaluations of program performance demonstrate meaningful decrease in program execution time due to the application of these optimizations, thus validating CLIR and enabling further .NET optimization research. iv ACKNOWLEDGEMENTS I would like to express my profound appreciation to my thesis chairperson, Dr. James B. Fenwick Jr., for all of the hard work and time he put into helping me, not just with this thesis, but also for all of the classes I took at Appalachian State University with him. I would also like to thank Dr. Cindy Norris, Professor Kenneth Jacker and Dr. James Wilkes for all of their advice and help. I would also like to acknowledge that without the support from my family this thesis would not have been possible. v Contents Abstract iv Acknowledgement v List of Figures viii List of Tables ix 1 Introduction 1 2 Background 3 2.1 The Common Language Infrastructure . 3 2.2 Overview of the Assembly .text section . 7 2.3 Common Intermediate Language Instruction Format . 19 2.4 Related Work . 24 3 CLIR: Common Language Infrastructure for Research 26 3.1 CLEL: Common Language Engineering Library . 27 3.2 CLOT: Common Language Optimization Toolset . 32 3.3 User Applications . 35 4 Optimizations Background and Implementation 38 4.1 Optimization Overview and Goals . 38 4.2 Branch Instruction Replacement . 39 4.2.1 Peephole Optimization Background . 39 4.2.2 Branch Instruction Replacement Implementation . 40 4.3 Constant Propagation . 47 4.3.1 Constant Propagation Background . 47 4.3.2 Constant Propagation Implementation . 50 4.4 Method Inlining . 53 4.4.1 Method Inlining Background . 53 4.4.2 Method Inlining Implementation . 55 5 Optimization Tests 58 5.1 Introduction and Testing Environment . 59 5.2 One Pass Branch Instruction Replacement . 60 5.3 Constant Propagation . 63 5.4 Method Inlining . 66 vi 6 Conclusion and Future Work 69 6.1 Conclusion . 69 6.2 Future Work . 71 Bibliography 72 A Bubble Sort in C# 76 B BubbleSort.exe hexadecimal dump 78 C Assembly File Format 84 Vita 120 vii List of Figures 3.1 CLIR overview . 27 3.2 CLEL class . 27 3.3 CLEL overview . 28 3.4 ICLELReader interface . 28 3.5 ICLELWriter interface . 28 3.6 CLELAssembly class . 29 3.7 Token class . 30 3.8 ClassDescriptor class . 30 3.9 BuiltInType class . 30 3.10 MethodDescriptor class . 30 3.11 CLELInstruction class . 31 3.12 CLOT namespaces overview . 32 3.13 IOptimization interface . 33 3.14 OptimizationConfiguration class . 33 3.15 OptimizationController class . 36 3.16 Optimization Scheduling Tool . 37 3.17 OptimizationKey class . 37 4.1 Peephole examples . 39 4.2 BIR example . 41 4.3 BIR algorithm (version 1) . 41 4.4 Branch body . 42 4.5 Peephole BIR algorithm (version 1.1) . 42 4.6 Branch body, with missed replacement . 43 4.7 Converge BIR algorithm (version 2) . 43 4.8 One Pass BIR algorithm (version 3) . 45 4.9 Branch lists example . 45 4.10 Constant propagation example . 47 4.11 Constant propagation can uncover other optimization opportunities . 48 4.12 Reaching definitions algorithm . 48 4.13 A gen and kill set example . 50 4.14 Example assignment statement decomposed . 51 4.15 Constant propagation example . 52 4.16 Rules for method inlining . 55 C.1 PE and assembly file format . 84 C.2 Assembly .text section . 84 C.3 Overview of .rsrc section . 112 viii List of Tables 2.1 Method header - public BubbleSort() . 8 2.2 Stream header - #GUID . 9 2.3 #˜ stream . 10 2.4 TypeRef table . 11 2.5 TypeDef table . 12 2.6 Field table . 13 2.7 MethodDef table . 14 2.8 Param table . 15 2.9 MemberRef table . 16 2.10 StandAloneSig table . 16 2.11 #Strings stream . 17 2.12 #US stream . 18 2.13 #Blob stream . 18 2.14 Unconditional branch examples . 20 2.15 Conditional branch examples . 20 2.16 Load examples . 21 2.17 Method call examples . 22 5.1 Relative program sizes . 59 5.2 Branches replaced for Game of Life . 61 5.3 Branches replaced for Huffman . 61 5.4 Branches replaced for ZipFolder . 62 5.5 Code size changes (in bytes) . 62 5.6 Average execution time (in seconds) for BIR . 62 5.7 ZipFile C# code fragment . 63 5.8 ZipFile CIL . 63 5.9 ZipFile Assembly . 63 5.10 Number of Constants Propagated . 64 5.11 Average execution time (in seconds) . 64 5.12 Huffman Coding VB.NET code fragment . 65 5.13 FindProbabilitiesForSymbols CIL . 65 5.14 Average execution time (in seconds) for Method Inlining at 10% and 20% . 67 5.15 Average execution time (in seconds) for Method Inlining at 30%, 40% and 50% 67 C.1 DOS header . 85 C.2 DOS stub instructions . 87 C.3 PE header . 88 C.4 Standard fields . 88 ix C.5 NT specific . 90 C.6 Data directories . ..