Parametric Polymorphism for Software Component Architectures and Related Optimizations
Total Page:16
File Type:pdf, Size:1020Kb
Parametric Polymorphism for Software Component Architectures and Related Optimizations by Cosmin Eugen Oancea Graduate Program in Computer Science Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Faculty of Graduate Studies The University of Western Ontario London, Ontario July, 2005 c Cosmin Eugen Oancea 2005 THE UNIVERSITY OF WESTERN ONTARIO FACULTY OF GRADUATE STUDIES CERTIFICATE OF EXAMINATION Chief Advisor Examining Board Advisory Committee The thesis by Cosmin Eugen Oancea entitled Parametric Polymorphism for Software Component Architectures and Related Optimizations is accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy Date Chair of Examining Board ii Abstract Parametric polymorphism has become a common feature of mainstream programming languages, but software component architectures have lagged behind and do not sup- port this feature. The immediate consequence is that applications cannot naturally combine the functionality exposed by various parameterized modules, if it happens that the implementation language differs. This significant problem surfaced first and most acutely in the computer algebra community, where parametric polymorphism is heavily used for the specification and enforcement of the algebraic interfaces and in the implementation of algorithms that work over various coefficient rings or fields. Com- plex, specialized mathematical libraries, servicing disjoint areas are implemented in various languages and therefore they cannot yet work together to attack increasingly difficult problems. This thesis examines the problem of accommodating parametric polymorphism, and related optimizations in a multi-language, distributed setting. We report on a first experiment, where we developed the Alma framework that allows Aldor libraries to extend Maple in a effective and natural way, and constitutes a new approach to structuring computer algebra systems. The motivation for this experiment are twofold: First, we are interested in understanding the issues that arise in matching the compile-time parametric polymorphism of Aldor’s dependent types iii with the dynamic parametric polymorphism of Maple’s module-producing functions, and in matching the Aldor’s strongly type system with Maple’s dynamically typed system. Second, we are interested in the practical problem of using Aldor as an extension mechanism for the popular Maple computer algebra system. The details of generics, templates or functors, as they are variously called, dif- fer significantly in different programming languages. We investigated how to resolve different binding times and parametric polymorphism semantics in a range of rep- resentative programming languages, and identified a common ground that can be suitably mapped to different language bindings. We explore the possibility of a systematic solution for parametric polymorphism, that should encompass many languages in a simple way. We present a generic com- ponent architecture extension that provides support for parameterized components, and can be easily adapted to work on top of various software component architectures in use today: corba, jni, dcom. We have implemented and tested our extension on top of corba.We present Generic Interface Definition Language (gidl), an extension to corba-idl, supporting generic types, and our language bindings for C++, Java, and Aldor. We describe our implementation of gidl, consisting of a gidl to idl compiler and tools for generating linkage code under the language bindings. gidl captures a very general notion of parametric polymorphism such that it can meaningfully be supported by various languages, and has the power to model the structure and semantics of system’s components. To test the effectiveness of our model for generics, we have investigated how to expose C++’s STL and Aldor’s BasicMath libraries to a multi-language environment, and discuss our mappings in the context of automatic library interface generation. iv Our work in the context of exposing generic libraries to a multi-language, po- tentially distributed environment has revealed several performance issues. First, as different components are separately compiled, the traditional compiler optimizations, such as inlining and parallelization, will fail to perform aggressively. Second, the over- head introduced by the inter-process communication stalls can be quite significant. Finally, this thesis explores speculative optimizations in the attempt to speed up the application performance in distributed environments. v Acknowledgments First of all, I would like to express my sincere gratitude to my supervisor, Professor Stephen Watt, for his valuable guidance, continuous interest, and support for this research work. I wish further to express my sincere appreciation to Jason Selby and Professor Mark Giesbrecht for the many hints, valuable discussions, comments and collaboration related to the “Compiler Middleware” NSERC strategic grant projects. I am very grateful to Professor Hanan Lutfiyya for carefully reading and sharing her expertise in connection to three of my papers and for her helpful suggestions, support and feedback. I would like to thank my colleagues and faculty at the ORCCA lab, Western Ontario University of whom many contributed to this thesis in one way or another. I also express my heartfelt thanks to all Faculty, staff, and fellow graduate students in the Computer Science Department. Above all, I thank my family for their encouragements, support, and love during all these years. A further special thanks goes to my mother, Lucia Baltag, for being to me a model of determination, perseverance, and hard work, and to my wife Monica Ulici for her patience and understanding during the course of my graduate studies. vi Chapters of this thesis are based on co-authored papers accepted to the 2005 Inter- national Symposium on Symbolic and Algebraic Computation, the 2005 International Conference on Object-Oriented Programming, Systems, Languages and Applications, and the 2005 International Conference on Parrallel and Distributed Processing Tech- niques and Applications. I thank again the co-authors Stephen Watt, Jason Selby, and Mark Giesbrecht for the fruitful collaboration. vii Table of Contents 1 Introduction 1 2 Background and Related Work 7 2.1 Early Experiment . 8 2.1.1 The Aldor Programming Language . 8 2.1.2 C++/Aldor Interoperability . 10 2.2 Parametric Polymorphism Semantics . 12 2.2.1 A Formal Introduction to Types Systems and Parametric Polymorphism . 14 2.2.1.1 Informal Nomenclature for Typing, Execution Errors, and Related Concepts . 15 2.2.1.2 Concepts Related to Formalizing a Language . 18 2.2.1.3 System F<: ....................... 22 2.2.2 Parametric Polymorphism Semantics and Implementation in Several Languages . 28 2.2.2.1 Modula-3, C++, Ada . 28 viii 2.2.2.2 Java, C# ......................... 31 2.2.2.3 Standard ML and Related Languages . 33 2.2.2.4 Concluding Remarks . 35 2.3 Mainstream Software Component Architectures . 36 2.3.1 Common Object Requests Broker Architecture . 37 2.3.1.1 Interface Definition Language (IDL) . 38 2.3.1.2 Overview of CORBA Architectural Components . 42 2.3.2 Microsoft Component Technology . 45 2.3.2.1 Component Object Model (COM) . 46 2.3.2.2 Distributed Component Object Model (DCOM) . 48 2.3.2.3 .NET Framework . 50 2.3.3 Java Native Interface (JNI) . 54 2.4 Thread Level Speculation . 57 2.4.1 Related Work in Thread Level Speculation . 59 2.4.2 Our Non-Distributed TLS Approach . 62 3 ALMA 66 3.1 Chapter Introduction . 66 3.2 Example . 70 3.3 Aspects of Maple and Aldor . 72 3.4 Alma Design . 74 ix 3.4.1 Rationale of the Design . 75 3.4.2 Example of Correspondence . 77 3.5 The Maple Stub . 80 3.5.1 Mapping Rules . 81 3.5.2 Foreign Object Layout . 86 3.5.3 Type Checking . 88 3.6 The C and Aldor Stubs . 89 3.7 Example Implementation . 92 3.8 Chapter Conclusions . 94 4 GIDL 96 4.1 Chapter Introduction . 96 4.2 Motivation and Design Point . 100 4.3 Generic IDL . 102 4.3.1 Rationale of the Design . 103 4.3.2 The GIDL Parametric Polymorphism Semantics . 104 4.3.3 GIDL’s Grammar and Consistency Checks . 108 4.3.4 Well-Formedness Type Rules . 111 4.3.5 GIDL to IDL Transformation . 114 4.4 High Level Ideas for Mapping Qualified Generic Types . 116 x 4.4.1 Basic Ideas . 116 4.4.2 Mapping the Export-Based Qualification . 118 4.4.3 Discussion . 122 4.5 The GIDL Base Application Architecture . 124 4.5.1 The GIDL Extension Architecture . 124 4.5.2 The User’s Perspective . 129 4.6 Language Bindings . 131 4.6.1 GIDL to C++ Mapping . 131 4.6.1.1 High-Level Mapping Ideas . 132 4.6.1.2 Wrapper Stub Object Model . 134 4.6.2 GIDL to Java Mapping . 136 4.6.3 GIDL to Aldor Mapping . 139 4.7 GIDL and Library Translations . 142 4.7.1 Accessing the C++ STL in a Multi-Language Environment . 143 4.7.1.1 Key Features in the Design of STL . 143 4.7.1.2 STL’s GIDL Specification . 144 4.7.1.3 Implementation Issues . 146 4.7.2 Accessing Aldor’s BasicMath Library in a Multi-Language Environment . 148 4.8 Chapter Conclusions . 154 xi 5 Distributed Models of Thread Level Speculation 156 5.1 Chapter Introduction . 156 5.2 Distributed Applications of Thread-Level Speculation . 158 5.2.1 Overview . 159 5.2.2 Distributed Speculation Model . 160 5.2.3 Distributed Speculative-Inlining Model . 166 5.3 Results . 171 5.4 Chapter Conclusion . 175 6 Conclusions 176 References 179 xii List of Tables 5.1 Distributed TLS 1st Architecture (overlapping communication) Nr = client thread pool size, G = “remote” method granularity (instructions) nMc speed-up compared to sequential. n = no. machines, c = client version nMcR as above, but with 1% rollback rate. 172 5.2 Distributed TLS 2nd Architecture (“inlining”-like speculation) G = “remote” method granularity (instructions) SS = slave sequence size, nMc speed-up compared to sequential.