TECHNIQUES IN ACTIVE AND GENERIC SOFTWARE LIBRARIES

A Dissertation by JACOB NYFFELER SMITH

Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

May 2010

Major Subject: Computer Science

Approved by:

Chair of Committee,   Jaakko Järvi
Committee Members,    Gabriel Dos Reis
                      Thomas Ioerger
                      Paul Gratz
Head of Department,   Valerie Taylor


ABSTRACT

Techniques in Active and Generic Software Libraries. (May 2010)

Jacob Nyffeler Smith, B.S., The University of Texas at Austin;
B.A., The University of Texas at Austin

Chair of Advisory Committee: Dr. Jaakko Järvi

Reusing code from software libraries can reduce the time and effort to construct software systems and also enable the development of larger systems. However, the benefits that come from the use of software libraries may not be realized due to limitations in the way that traditional software libraries are constructed. Libraries come equipped with application programming interfaces (APIs) that help enforce the correct use of the abstractions in those libraries. Writing new components and adapting existing ones to conform to library APIs may require substantial amounts of “glue” code that potentially affects software’s efficiency, robustness, and ease-of-maintenance. If, as a result, the idea of reusing functionality from a software library is rejected, no benefits of reuse will be realized.

This dissertation explores and develops techniques that support the construction of software libraries with abstraction layers that do not impede efficiency. In many situations, glue code can be expected to have very low (or zero) performance overhead. In particular, we describe advances in the design and development of active libraries — software libraries that take an active role in the compilation of the user’s code. Common to the presented techniques is that they may “break” a library API (in a controlled manner) to adapt the functionality of the library for a particular use case.

The concrete contributions of this dissertation are: a library API that supports iterator selection in the Standard Template Library, allowing generic algorithms to find the most suitable traversal through a container, allowing (in one case) a 30-fold improvement in performance; the development of techniques, idioms, and best practices for concepts and concept_maps in C++, allowing the construction of algorithms for one domain entirely in terms of formalisms from a second domain; the construction of generic algorithms for algorithmic differentiation, implemented as an active library in Spad, the language of the OpenAxiom computer algebra system, allowing algorithmic differentiation to be applied to the appropriate mathematical object and not just concrete data-types; and the description of a static analysis framework to describe the generic programming notion of local specialization within Spad, allowing more sophisticated (value-based) control over algorithm selection and specialization in categories and domains. We will find that active libraries simultaneously increase the expressivity of the underlying language and the performance of software using those libraries.

To Stephanie Hope & Kinsey Dorothea Danger Smith

ACKNOWLEDGMENTS

I never had any intention of learning to program; in fact, I was threatened with harm about ten years ago when making fun of programming and programmers; so, I suppose, it is with a certain amount of auto-schadenfreude that I find myself at the end of a multiyear journey into computer science — especially software engineering and the development of software libraries.

I would like to thank my adviser Dr. Jaakko Järvi first and foremost. Little does he know that it was reading his papers (and similar papers from the OSL) and poring over the design of his Boost libraries that finally prompted me to apply for school. I want to thank Mat Marcus for sitting in a 512 ft³ cell with Dr. Järvi and poring over my algorithms in order to make a large portion of this dissertation possible. Of course, working with Dr. Järvi would never have been possible without Dr. Thomas Ioerger’s patience with myself (and Erik, my co-conspirator in generic programming); I will forever be grateful to Tom for suggesting that—just possibly—computational biology was not the field for me.

Dr. Gabriel Dos Reis has, effectively, been the co-chair for this dissertation. I know I try his patience at almost all times, but he has been an invaluable teacher: each semester spent in his classes is, I feel, equivalent to whole Bachelor’s and Master’s degrees. I would also like to thank Dr. Paul Gratz for jumping on board this dissertation at such a late date and being so enthusiastic about its topic, regardless of the extremely short times I gave him to review and revise!

I would like to thank two different sources for support: Dr. James F. Leary (and the whole team at the old UTMB MCU) for his long-time support and encouragement and for telling me that if statistical software doesn’t work, don’t complain, learn to program and fix it! And, also, to Dr. Bart Childs for convincing me to come to A&M. The second source is the members of Jaakko’s lab (John Freeman, Xiaolong Tang), even though I don’t think any one of us managed to actually work on a project together. Hand-in-hand I would like to thank everyone in Dr. Bjarne Stroustrup’s lab (including Bjarne for being such a good sport): Luke Wagner for helping me to understand logic and language theory; Yuriy Solodkyy for our shouting matches and appreciation of books & coffee at the library; Dr. Damian Dechev and Peter Pirkelbauer for being too cool to be in such a geeky field. I owe a large debt to Yue Li not only for shepherding this dissertation to a close, but also for being a good friend my last year-or-so at TAMU.

Finally, I want to thank and send my love to my family for their infinite patience; to my wife, Stephanie, my daughter, Kinsey; to my brother for low-rents; and lastly, to my parents, who never once questioned why, after finishing my first degree in a little less than three years, I stayed in school for another ten.

TABLE OF CONTENTS

CHAPTER Page

I INTRODUCTION ...... 1
    A. Background ...... 3
        1. Permeable interfaces ...... 5
        2. Composition and adaptation ...... 7
        3. Library-directed transformation ...... 8
    B. Dissertation statement ...... 10

II A PARAMETRIZED ITERATOR REQUEST FRAMEWORK FOR GENERIC LIBRARIES ...... 12
    A. Introduction ...... 12
    B. Background and motivation ...... 17
    C. Iterator request framework ...... 19
    D. Examples ...... 22
        1. Timings for images ...... 22
        2. Alternate run-time characteristics for std::find ...... 25
        3. Alternate run-time characteristics for std::find II ...... 26
        4. Protein crystallography ...... 28
            a. Background information on PX ...... 29
            b. Density interpolation algorithm ...... 31
            c. Generic density interpolation using iterator selection ...... 34
    E. Computing with capabilities ...... 37
    F. Conclusion and future work ...... 39
        1. Readability ...... 40

III MULTILAYER LIBRARY COMPOSITION ...... 43
    A. Introduction ...... 43
    B. Background ...... 45
        1. From C++ 2003 to ConceptC++ ...... 46
        2. Generic programming in ConceptC++ ...... 48
    C. Cross-domain composition ...... 54
        1. Background of GIL and BGL ...... 56
        2. GIL–BGL composition ...... 58
        3. Performance results of the BGL to GIL adaptation ...... 65
    D. Multi-layer composition ...... 70
        1. Background of the MTL ...... 74
        2. Implementation of the ncuts algorithm ...... 75
        3. GIL–BGL composition for MTL ...... 76
        4. BGL–MTL composition ...... 77
        5. Results of the BGL to GIL to MTL adaptation ...... 84
    E. Concept maps and other adaptation mechanisms ...... 85
    F. Adaptation and overloading ...... 94
    G. Conclusions ...... 96

IV ALGORITHMIC DIFFERENTIATION IN AXIOM ...... 99
    A. Introduction ...... 100
    B. The OpenAxiom system ...... 103
    C. Spad ...... 104
        1. Syntax ...... 104
        2. Language features ...... 110
        3. Semantics ...... 110
            a. Operational semantics ...... 110
            b. Denotational semantics ...... 111
    D. Elements of algebraic theory of algorithmic differentiation ...... 113
        1. Algorithmic differential rings ...... 113
        2. Strategies of derivative evaluation ...... 115
        3. Control flow and differentiability ...... 116
    E. The Spad compiler ...... 117
    F. Implementation ...... 119
        1. Transformation to simple form ...... 120
        2. First order prolongation ...... 120
        3. Initial environment ...... 122
        4. Forward mode ...... 123
        5. Examples ...... 124
            a. The GRADIENT paper ...... 124
            b. Tchebychev polynomials ...... 127
    G. Employing generic algorithmic differentiation ...... 128
    H. Conclusion ...... 129
        1. Related work ...... 129
        2. Future work ...... 130

V LOCAL SPECIALIZATION FOR OPENAXIOM ...... 131
    A. Introduction ...... 132
    B. An overview ...... 134
    C. The Spad programming language ...... 135
        1. Syntax ...... 137
        2. An internal language ...... 140
        3. Translation of Spad to internal language ...... 143
    D. User-defined predicates ...... 147
    E. Static analysis of categories ...... 149
        1. The structure of the abstract domain ...... 150
        2. Abstract evaluation of syntactic forms ...... 150
        3. Expression explosion ...... 153
        4. Reduction of abstract stores ...... 154
    F. Implementation ...... 155
    G. Related work ...... 157
    H. Conclusion ...... 158

VI CONCLUSION ...... 160

REFERENCES ...... 164

VITA ...... 180

LIST OF FIGURES

FIGURE Page

1 Example code fragment for the active library MTL2...... 5

2 The iterator and const_ metafunctions for the framework...... 20

3 The set of begin functions for the framework...... 21

4 The set of tagged_begin functions for the framework...... 22

5 Iterator execution times...... 24

6 The three types of symmetry...... 31

7 The eight-point interpolation algorithm...... 33

8 The min_element generic algorithm...... 51

9 The ForwardIterator concept (simplified from the one in the STL). 52

10 A call to the generic min_element function...... 53

11 Templated concept map definition describing a modeling relationship between two concepts ...... 55

12 The signature of the breadth_first_search function in the BGL ...... 59

13 Concepts and concept-maps for graphs and images ...... 60

14 The modeling relationship between images and graphs ...... 61

15 The concept_map adapting models of GIL ImageView to become models of BGL IncidenceGraph...... 63

16 The implementation of flood-fill using an adaptation from GIL to BGL ...... 64

17 The timing results for the function flood_fill ...... 66

18 Timing results for the function segmentation...... 67


19 Timing results for the function backbone_healing...... 68

20 A high-level view of GIL-BGL-MTL multilayer composition...... 73

21 Neighborhood concept map ...... 78

22 A schematic of the implementation of Shi and Malik’s image analysis architecture...... 80

23 The concept for the MTL’s SparseMatrix...... 82

24 Definition of the function num_columns ...... 83

25 The ncuts function...... 84

26 Code fragment showing a call to the ncuts function...... 84

27 Segmentation results for ncuts ...... 85

28 Accidental use of the same function name in two different type classes (left column) and in two different concepts (right column) ...... 90

29 Comparison of concept-maps and type-classes ...... 92

30 Abstract syntax of the Spad language ...... 105

31 Evaluation rules of Spad statements...... 112

32 The pipeline for performing AD...... 119

33 The syntax of the core parts of Spad...... 138

34 The syntax of the internal language...... 141

35 The small step operational semantics of the internal language ...... 142

36 Translation rules from Spad to the internal language...... 144

37 Definitions of the functions ∈ and ⊕...... 145

38 The definition of the category ComplexCategory ...... 146


39 The extension of the Spad grammar and translation rules to support user-defined predicates ...... 148

40 Rules for abstract interpretation of Spad programs...... 151

CHAPTER I

INTRODUCTION

More than forty years ago Doug McIlroy proposed that families of reusable software codes were necessary for developing large scale software [72]. He advocated a method where a family of software codes would be available for programmers such that rather than coding an algorithm by hand, a programmer could acquire the algorithm’s code from a library of prewritten codes. This “software library” would provide not only the implementations of different algorithms, but also implementations of alternate versions of each algorithm. An implementation of an algorithm would be specialized for some property such as speed, efficiency, numerical robustness, or ease-of-use. After identifying a set of algorithms and data-structures necessary for an application, a programmer composes the codes found in a software library into a program.

McIlroy’s vision for software libraries is widely embraced. All mainstream, general-purpose languages, such as Java, C++, and C#, have a large number of software codes available in many software libraries, such as Java’s Platform Library [102], C++’s STL [100], or C#’s System.Forms [26]. Software libraries provide a range of algorithms and data-structures, some as simple as finding the lesser of two values, some as complex as generating 3-D virtual worlds. There now exist many such software libraries, for almost any given task.

Software libraries provide two important benefits to programmers. First, a programmer does not need as much time or effort to code an application, because codes in the software library can be reused rather than redeveloped. Second, a programmer can write increasingly larger and more sophisticated applications, because software libraries provide abstractions which allow the programmer to reason about and express programs in terms of “higher level” concepts, rather than being occupied by “lower level” concerns such as details of implementation of particular algorithms. These two benefits of software libraries are essential to the development of modern software.

Even given the widespread use of software libraries and the advantages they provide, the software industry continues to struggle with low quality, high defect rates, and low productivity. The “software crisis” [72] identified forty years ago continues to hold its grip. Frustratingly, one aspect of this crisis is due to the use of software libraries themselves. Using a software library requires a programmer to write “glue” code to adapt the software library’s code to other codes in the application. The complexity and size of the glue code can become substantial as the size of the software increases. Furthermore, glue code often has a (hidden) computational cost: it is possible for glue code to have a deleterious effect on the performance of the whole program [75]. Similarly, the abstractions provided by a software library may hinder performance, caused by, e.g., redundant checks of data validity and inefficient “general case” implementations.

The size and complexity of the glue code, and the negative impact on the performance of a program from the glue code or from the use of library abstractions, may force a programmer to write custom codes to re-implement the algorithms and data-structures provided by a software library. Rewriting requires both time and expertise on the part of the programmer, and increases the probability of defects in the program. Furthermore, if rewriting requires development of new codes “in house” rather than relying upon a library vendor, then there is a corresponding increase of maintenance costs to cover the new codes.

This dissertation will explore techniques that support the construction of software libraries with abstraction layers that do not impede efficiency, and where the necessary glue code has low performance overhead. In particular, this dissertation will describe advances in the design and development of generic libraries, software libraries targeted to solve a class of similar problems, in terms of active libraries, software libraries that take an active role in the compilation of the user’s code [109]. All of the methods described in this dissertation are motivated by efforts to construct libraries to solve practical research and development problems. For each presented technique, we describe its application to at least one problem domain and a software library associated with that domain. The presentation of each technique will also discuss the difficulties experienced in the use of “traditional” software library designs to support the associated problem domains. The next section will expand upon the background of software libraries and introduce the new techniques for the design of libraries this dissertation presents.

The journal model is Science of Computer Programming.

A. Background

Software libraries are tools; we view the design of software libraries as an engineering effort separate from the initial development of the algorithms and data-structures which make up a software library. Determining the best way to implement, collect, and compose algorithms and data-structures into a coherent software library depends upon the goals of that library. For instance, a software library may be implemented for clarity of code, for efficiency of operations, for numerical robustness, or for coverage of a specific problem domain.

The traditional view of a software library is as a collection of functions and data-structures encapsulating codes [117]. The role of a traditionally designed library is passive; that is, the library is essentially an opaque object with an interface through which the user can gain access to functionality of the library. Such an abstraction boundary provides a host of benefits: modularity and increased programmer productivity, simplified reasoning about program behavior, separate compilation, etc. A traditionally designed software library respects the abstraction boundary its interface defines: the library provides a number of entry points that the user can call, but does not include codes and data-structures which can observe the manner in which a user calls the entry points. A traditionally designed library is thus not designed to take advantage of the varying capabilities of client code of the library.

Some modern software libraries, such as the Standard Template Library (STL) [93], the Matrix Template Library 2 (MTL2) [90], and Blitz++ [108] diverge from the architecture of traditionally designed software libraries: they observe, act upon, and interact with the user’s code. They are active libraries [109, 117] because they participate in the interpretation, optimization, and transformation of the user’s code during compilation [110]. In some cases, the user of an active library can be more accurately described as the client of the library, as the library transforms both the meaning and structure of the user’s code. This dissertation will focus on three aspects of active libraries:

1. Permeable interfaces which allow a library to observe the capabilities of the client’s code so that, for example, the library can select alternate versions of an algorithm based on the capabilities of the user’s code.

2. Non-intrusive adaptation and composition mechanisms allow components de- veloped for a particular domain to be easily used with algorithms from different domains, with minimal performance impact.

3. Library-directed transformation of client code allows the user to write code in the most direct (obvious) way, while the library interacts with the compiler to provide extra checking or transformation of the user’s code.

To clarify the use of permeable interfaces, non-intrusive adaptation and composition, and library-directed transformation in active libraries, we will discuss each of these three aspects of active libraries in the next few sections. First, to give the feel of active libraries, we describe a concrete example of the use of a particular library. MTL2 provides a number of classes representing various kinds of matrices, such as symmetric, diagonal, and upper-triangular. MTL2 also provides several algorithms which implement matrix operations, such as addition and multiplication.

matrix<float, diagonal<> >::type M, N, P;
// initialize N, P
M = N * P;

Fig. 1. Example code fragment for the active library MTL2.

The code in Figure 1 shows the use of MTL2 for the multiplication of a dense diagonal matrix N by the dense diagonal matrix P, where the result of the expression is assigned to the dense diagonal matrix M. The usual algorithmic complexity for matrix-matrix multiplication is cubic in the number of rows (for square matrices). Because MTL2 can observe that the matrices are diagonal, it can generate better code. The resulting executable code has linear algorithmic complexity in the number of rows [90]. Even further, the successor to MTL2, MTL4 [36, 35], can be configured to resolve the expression to a call to the Level 3 BLAS operation DGEMM [64]¹.

¹This feature was added to improve performance; however, it was superseded by the fact that MTL4 is second only to the Goto library [34], and vastly outperforms ATLAS, Lapack and other Fortran- and C-based libraries [35].
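To see why the diagonal case admits a linear-time implementation, consider the following sketch in plain C++ (not MTL2's generated code): a diagonal matrix is fully determined by its diagonal, so the product reduces to one multiplication per row.

#include <cstddef>
#include <vector>

// A diagonal matrix can be stored as the vector of its n diagonal
// entries. The product of two diagonal matrices is again diagonal,
// with elementwise products on the diagonal: O(n), not O(n^3).
std::vector<float> diagonal_product(std::vector<float> const& n,
                                    std::vector<float> const& p) {
    std::vector<float> m(n.size());
    for (std::size_t i = 0; i != n.size(); ++i)
        m[i] = n[i] * p[i];   // one multiplication per row
    return m;
}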

1. Permeable interfaces

A library component that observes the capabilities of its inputs, i.e., the properties of the types of the inputs, has a permeable interface. There are a number of mechanisms implementing permeable interfaces, including, but not limited to, traits classes and tagging [77, 5], and techniques like expression templates [108, 106]. Permeable interfaces let a library designer reduce the number of explicit annotations required in the use of a library component. Below, we show code which compares use of MTL2 (first fragment) to functionally equivalent code using Lapack (second fragment). Assuming that A, B, and C are all matrices, the codes effect a multiplication and assignment operation between the three matrices (source: [82]):

// MTL2:
matrix C = A * B;

// Lapack (CBLAS):
cblas_dgemm(ML_BLAS_STORAGE, CBlasNoTrans, CBlasNoTrans,
            A.rows(), B.cols(), A.rows(), 1.0,
            A.raw_data(), A.data_stride(),
            B.raw_data(), B.data_stride(), 0.0,
            C.raw_data(), C.data_stride());

The algorithmic complexity and run-time performance of the above codes are roughly the same [106, 35, 90]. Both codes use their inputs to select between alternate implementations of the multiplication-assignment algorithm. However, the MTL2 component observes the information needed for selecting a particular implementation of the algorithm from the types of its inputs — whereas the Lapack component requires that the client specify the information necessary for selecting a particular implementation of an algorithm by passing in explicit arguments. We observe that the MTL2 code has a distinct advantage over the Lapack code in terms of notational clarity. As a result, we can expect the code to be easier to maintain, and simpler to write.
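The traits classes and tagging mentioned above can be sketched compactly. Below is a minimal illustration, with all names hypothetical rather than MTL2's actual interface: each matrix type carries a "shape" tag, a traits class reads the tag off the type, and the component dispatches on it without any explicit annotation from the caller.

struct dense_tag {};      // tags naming matrix "shapes"
struct diagonal_tag {};

// A traits class reads the tag off the matrix type.
template <typename Matrix>
struct matrix_traits {
    typedef typename Matrix::shape shape;
};

// Implementation selected for general dense matrices: O(n^3).
template <typename M>
void multiply(M const& a, M const& b, M& c, dense_tag)
{ /* triple loop over rows, columns, and the inner dimension */ }

// Implementation selected for diagonal matrices: O(n).
template <typename M>
void multiply(M const& a, M const& b, M& c, diagonal_tag)
{ /* elementwise product of the diagonals */ }

// The permeable interface: the caller never names the shape; the
// component observes it from the types of its inputs.
template <typename M>
void multiply(M const& a, M const& b, M& c) {
    multiply(a, b, c, typename matrix_traits<M>::shape());
}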

2. Composition and adaptation

Library composition is the use of data-structures and functions from two or more libraries for a common task. For instance, the dense matrix data-structures defined by the MTL4 can be passed as arguments to Lapack algorithms which expect dense matrix data-structures defined by Lapack. The reuse of Lapack algorithms for MTL4 data-structures results in significant savings in development and debugging time. Ideally, such reuse has no cost either in terms of performance or maintainability to the user.

The prerequisite of library composition is that the data-structures and algorithms must be compatible. When the components of libraries are incompatible, however, an adaptation can be used to make the components compatible. An ideal adaptation mechanism should be non-intrusive, i.e., the mechanism should not require the modification of the component being adapted. Non-intrusiveness is crucial since the source code may not be available, or changes to the component will cause other parts of the program to fail. In addition, adaptation mechanisms should not incur unnecessary performance costs. Such costs come from two sources: (1) the cost of the adaptation mechanism itself; and (2) the costs of converting data-structures and adding missing functionality necessary to make the components compatible. The first cost can be influenced by programming language technologies and library design, while the latter cost is inherent to the adaptation.

Adaptation mechanisms which have a non-zero cost can lead to an overall decrease in the performance of a program. One possible reason for the decrease in performance is deeply stacked layers of adaptations: each layer, by itself, has a small cost, but the sum of the costs of the layers is large [75]. An example is the translation of names (identifiers) of functions. Each time a function is translated, for example, by calling another function, there is a cost in manipulating the call stack, which can be quite large compared to the cost of the function itself.

Some adaptations have an inherent cost because the adaptation must include additional functionality or data-structures to make components compatible. These adaptations can still provide important benefits both in terms of code re-use and total performance gains, if appropriately used. An example is the adaptation between the Parallel Boost Graph Library (PBGL) [41] and the Iterative Eigensolver Template Library (IETL) [104]. As shown by Breuer et al. in [11], data-structures from the domain of iterative eigensolvers can be adapted to be compatible with data-structures from the domain of parallel graphs. This means that the effort of writing parallel algorithms and data-structures for the domain of graphs (PBGL) can be re-used for the domain of iterative eigensolvers (IETL). This adaptation results in significant performance improvements for the IETL due to the increase in parallelism.

Without composition, the benefits of software libraries are diminished. Non-intrusive and low cost adaptation mechanisms increase the scope of possible compositions by removing the mismatch between semantically equivalent but syntactically differing data-structures and functions.
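To make the first kind of cost concrete, the following is a minimal sketch of a non-intrusive adaptation (all names hypothetical; this is not PBGL or IETL code): the library reads values only through a traits class, so a third-party type is adapted by specializing the traits rather than by modifying the type. The mechanism itself typically inlines away; only the conversion arithmetic, the inherent cost, remains.

namespace vendor {
    // A third-party type whose source we cannot (or should not) modify.
    struct pixel { unsigned char r, g, b; };
}

namespace lib {
    // The library reads values only through this customization point.
    template <typename T> struct intensity_traits;  // primary: undefined

    template <typename T>
    double intensity(T const& t) { return intensity_traits<T>::get(t); }
}

namespace lib {
    // The adaptation lives alongside the client code, not inside
    // either library; vendor::pixel is untouched.
    template <> struct intensity_traits<vendor::pixel> {
        static double get(vendor::pixel const& p) {
            return (p.r + p.g + p.b) / 3.0;   // inherent conversion cost
        }
    };
}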

3. Library-directed transformation

We define library-directed transformation as the manipulation of data-structures representing the parse-trees of expressions [110]. Library-directed transformations consist of two parts: first, there must exist mechanisms to acquire representations of expressions as data-structures; and second, those data-structures should be manipulated for the purpose of improving run-time performance, providing stronger guarantees against memory leaks, etc.

An example of a mechanism to acquire representations of expressions as data-structures is “expression templates” [35, 36, 77, 107]. Expression templates are implemented using a combination of parametrized data-structures and operator overloading. For instance, the following function could be part of an expression template library to build a data-structure representing multiplication.

template <typename E1, typename E2>            // some types E1 and E2
mul_op<E1, E2> operator* (E1 const& e1, E2 const& e2) {  // any multiplication
    return mul_op<E1, E2>(e1, e2);             // mul_op represents multiplication
}

Instead of immediately computing the product of the values “e1” and “e2”, the operator “*” returns the data-structure “mul_op” which represents the multiplication of the arguments. Similar data-structures and functions can be written for addition, subtraction, division, general unary and binary operations, etc.

The manipulation of data-structures representing expressions can improve the run-time performance of a program. For example, an expression structure “B*C+B*D” for large, dense matrices “B”, “C”, and “D” can be transformed into the more run-time efficient expression “B*(C+D)”, assuming distributivity of the multiplication over addition. One library that uses such transformations is MTL4, which uses loop unrolling, tiling, specialized multiplication operations, etc., to produce codes whose run-time performance is at least as fast, and sometimes much faster, than equivalent FORTRAN77 codes [35].

Combining domain specific notations and library-directed transformations allows an active library to substitute for domain expertise [110, 107]. For instance, MTL4 uses the operators “*” and “+” for the multiplication and addition of any two kinds of matrices, i.e., dense, sparse, upper-triangular, etc. Expressions using those, and other, operators are optimized for run-time performance by MTL4. When the user writes an expression such as “C = A * B”, MTL4 produces efficient code for the expression dependent upon the types of the elements of the matrices, the storage specification of the matrices, whether the matrices are upper- or lower-triangular, etc.; the user does not need to know about these optimizations.
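A sketch of the kind of rewrite described above, not MTL4's implementation: an extra, more specialized overload recognizes the shape B*C + B*D in the expression tree and rebuilds it as B*(C+D). For simplicity the sketch assumes that a type-level match implies the two left factors denote the same matrix; a production library would verify this (for example, by comparing addresses) when evaluating the expression.

template <typename L, typename R>
struct mul_op {
    L l; R r;
    mul_op(L const& l_, R const& r_) : l(l_), r(r_) {}
};

template <typename L, typename R>
struct add_op {
    L l; R r;
    add_op(L const& l_, R const& r_) : l(l_), r(r_) {}
};

// General case: record the addition in the expression tree.
template <typename L, typename R>
add_op<L, R> operator+ (L const& l, R const& r) {
    return add_op<L, R>(l, r);
}

// More specialized overload, preferred by partial ordering: the common
// left factor is hoisted out, so B*C + B*D builds B*(C+D) -- two
// multiplications become one (valid when * distributes over +, and
// assuming both mul_op nodes share the same left operand).
template <typename B, typename C, typename D>
mul_op<B, add_op<C, D> >
operator+ (mul_op<B, C> const& x, mul_op<B, D> const& y) {
    return mul_op<B, add_op<C, D> >(x.l, add_op<C, D>(x.r, y.r));
}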

B. Dissertation statement

In this dissertation we view a software library as more than just a package of algorithms and related data-structures: we view the software library as a program which can benefit from abstraction. With this view in mind we can ask what data-structures and algorithms we can describe to support the development of software libraries. For instance, what algorithms and data-structures allow us to write permeable interfaces? More importantly, are these data-structures and algorithms general to any software library? Similarly, what features allow us to easily write — and easily maintain — the composition of library components? Can we generalize the notion of composition and write down best practices for any software library? Is it possible to write software libraries that participate in some of the roles that are classically only done by the compiler, i.e., compile-time transformations and optimizations? Are such libraries (or the notions of such libraries) portable across compilers?

In the preceding sections we have described a number of capabilities of active libraries. The techniques for implementing active libraries are relatively well established — the STL, perhaps the first significant active library, is more than 15 years old. However, this dissertation argues that the tools available to active library writers — the algorithms and data-structures for writing software libraries — are still limited. Therefore, new techniques, clarification and augmentation of old techniques, and new language features are needed to aid the implementation of active libraries. The central theme of all of the chapters is the enumeration of recipes for constructing software which observes the capabilities of the inputs to the libraries’ components, and then performs transformations to the user’s code and the library’s implementations of algorithms based on those capabilities. The remaining chapters of this dissertation will then present advances in each of the three capabilities of active libraries for generic programming discussed above:

• Chapter II will describe a framework which generalizes the selection of associated datatypes from components. In particular, the framework is implemented to retrieve different categories of iterators from collections.

• Chapter III develops techniques, idioms, and best practices for the use of concepts and concept_maps, new language features in the upcoming revision of C++ called C++0x, for the construction of low- and zero-cost adaptations.

• Chapter IV describes an active library for computing the algorithmic differentiation of client code. This framework demonstrates the use of library-directed transformations in the Spad language [58].

• Chapter V formalizes the generic programming notion of (local) specialization within the language Spad. Specialization is an important language feature used in the construction of active libraries.

Each chapter will individually cover related work and background material; the dissertation will conclude in Chapter VI.

CHAPTER II

A PARAMETRIZED ITERATOR REQUEST FRAMEWORK FOR GENERIC LIBRARIES

The iterator abstraction is central to many generic libraries, such as the C++ Standard Template Library (STL). Generic algorithms are commonly specialized with regard to the kinds of iterators available. There is, however, no mechanism for selecting the kind of iterator that a container should provide for a particular algorithm. We propose a framework where an algorithm can query for the categories of iterators a container supports, and select the most appropriate iterator. This is a new axis of parametrization for STL-like generic libraries. The motivation for the framework comes from our work with the CCTBX and TEXTAL Protein Crystallography (PX) libraries. The models of the data in this domain provide multiple, complex iteration schemes, and the efficiency of many algorithms depends crucially on selecting a suitable scheme. The framework allows individual algorithms to access the preferred iteration scheme over the container it uses. We describe the framework with examples in the context of the STL and the PX libraries.

A. Introduction

Generic programming emphasizes algorithm specialization, which essentially means providing many implementations for the same functionality. A specialization of a generic algorithm places more requirements on its inputs and can make more assumptions on them, possibly enabling a more efficient implementation. A simple example is finding an element in an unsorted versus in a sorted sequence: the former requires linear run-time with respect to the length of the sequence, the latter only logarithmic, since the assumption of sortedness allows an implementation to use a binary search strategy.

In generic libraries, such as C++’s Standard Template Library [97] (STL), algorithm specialization is used for many algorithms whose inputs are types conforming to the STL’s iterator concepts. For example, the distance function—which measures the distance between two iterators—is defined for all types that meet the requirements of the InputIterator concept. The least specialized version of distance is implemented by counting the number of times the first iterator is incremented to reach the second iterator. A specialization of this distance-computing algorithm—operating in constant time—is provided for RandomAccessIterators. Due to the random access capability of the iterators, the implementation of this specialization is a simple subtraction.

Typically, algorithm specialization takes place automatically: the generic library selects the best available specialization for the types of the inputs to the algorithm. STL algorithms operate on sequences described as pairs of iterators. The source of a sequence is often a container which provides mechanisms (such as the begin and end functions) to present the contents of the container as a sequence. When using these mechanisms the container provides the most powerful iterator types it can offer in order to enable the most efficient algorithm specializations. This behavior occurs because of the expectation on the part of the writer of active libraries that the library will be composed by the end-user. For example, v.begin() for a v with type std::vector gives a RandomAccessIterator, whereas l.begin() for an l with type std::list is only capable of providing a BidirectionalIterator. The result is that the two lines below eventually invoke different implementations of the distance function. The first implementation is a constant time operation with respect to the distance between the iterators, and the second is linear:

distance(v.begin(), v.end());
distance(l.begin(), l.end());
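The selection between these two implementations is made with tag dispatching, a technique revisited in Section B. A simplified sketch, not the actual STL source (and named my_distance to avoid colliding with std::distance):

#include <iterator>

// Linear version: count increments until the iterators meet.
template <typename InputIter>
typename std::iterator_traits<InputIter>::difference_type
distance_impl(InputIter first, InputIter last, std::input_iterator_tag) {
    typename std::iterator_traits<InputIter>::difference_type n = 0;
    for (; first != last; ++first) ++n;
    return n;
}

// Constant-time version: random access permits simple subtraction.
template <typename RandomIter>
typename std::iterator_traits<RandomIter>::difference_type
distance_impl(RandomIter first, RandomIter last,
              std::random_access_iterator_tag) {
    return last - first;
}

// The dispatcher: the iterator's category tag selects the overload.
template <typename Iter>
typename std::iterator_traits<Iter>::difference_type
my_distance(Iter first, Iter last) {
    return distance_impl(first, last,
        typename std::iterator_traits<Iter>::iterator_category());
}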

Algorithm specialization along the hierarchy of iterator concepts is not the only opportunity for specialization. In particular, in a design where algorithms also operate on containers and not solely on iterators, the selection of the iteration scheme for a particular container can be subjected to specialization, and can result in performance gains. Note that in the active libraries built for many domains the norm is to pass data to generic algorithms as containers, not iterators; graphs [89] and matrices [90] serve as examples of such data. In Section 4 we describe data structures in the domain of protein crystallography where this is true as well.

When requesting a sequence from an STL container (with the begin and end member functions), the container typically provides iterators that conform to the most refined iterator concept possible, in terms of algorithmic performance. In STL containers there is a strong correspondence between the most-capable iterator a container provides and the most-refined iterator a container provides. Not all containers have such a strong correspondence between most-capable and most-refined: it might be more efficient to provide a less capable iterator scheme, because the possibility exists for the container to implement the less capable iterator in a more efficient way.

For example, consider a generic image type that represents a two-dimensional raster image with an arbitrary number of channels, parametrized over the value type of the channels. Examples of concrete instances of such an image type include a one-bit black and white mask, an RGB or CMYK bitmap, or images with a larger number of color channels. Such an image type can be implemented as an array whose size is the product of the width, height, and the number of channels. Assume we lay out the image data in this array as a list of raster lines, where a raster line is a list of pixels, and where a pixel is a list of values from each channel. Consider visiting each channel value and performing some independent operation on it. If the only iteration scheme provided by the image type directly models the hierarchy “raster line–pixel–channel value”, a function visiting each channel value requires three nested loops:

raster_line_iterator rtr = image.begin();
raster_line_iterator rnd = image.end();
for ( ; rtr != rnd; ++rtr ) {
    pixel_iterator ptr = rtr->begin();
    pixel_iterator pnd = rtr->end();
    for ( ; ptr != pnd; ++ptr ) {
        channel_iterator ctr = ptr->begin();
        channel_iterator cnd = ptr->end();
        for ( ; ctr != cnd; ++ctr )
            some_operation(*ctr);
    }
}

This hierarchy may not, however, be necessary for an operation performed with the channel values. In such a case, the iteration scheme will perform a non-trivial amount of unnecessary work, compared to the computational cost of the function object some_operation(). This is because of the mismatch between the most-refined iterators for images, which provide the most convenient iteration scheme for programmers, and the most-capable iterators, which provide the most efficient access to the underlying data. A more direct and efficient mechanism for visiting all channel values is to iterate over the underlying contiguous memory directly, ignoring the hierarchical structure:

channel_iterator ctr = begin<linear>(image);
channel_iterator cnd = end<linear>(image);
for ( ; ctr != cnd; ++ctr )
    some_operation(*ctr);

The begin and end functions are requests for a non-hierarchical iteration scheme (see Section C). This iteration scheme is considerably faster (we report timing results in Section 1). Requesting a simpler iterator scheme for efficiency is the “dual” of algorithm specialization over iterator schemes. If equivalent algorithmic functionality can be provided with the same complexity guarantees with a simpler iterator, then it is preferable to use the simpler iterator as it will have improved performance. For example, the exact same code (the STL’s find algorithm) has drastically different performance characteristics depending upon which iteration scheme is used. Experimental results can be found in Sections 2 and 3.

In this chapter, we propose a lightweight framework for algorithms to request a particular iterator scheme from a container or sequence source. The library consists of a small number of functions forming the API for the client of the library, and requires the algorithm and container implementations to follow a small set of conventions — a relatively light burden for library developers. The API is designed in such a way that it can be non-intrusively and retroactively added to a container library. The addition of this API to a container library helps to normalize the interface for the request of iterators for all container libraries which use the API.

The core of the framework is a set of tag structs which we call iterator tags. The global functions begin and end, parametrized with a tag and a container, give access to the iterator schemes that a container provides. In essence, the proposed framework suggests a new degree of parametrization for STL-like generic libraries. We suspect that it is possible to identify a set of iterator tags that could be established similar to the iterator concepts in the STL. We do not suggest such a set in this dissertation, but describe a handful of useful iterator schemes. We identify situations where parametrizing the iteration scheme provides notable benefits. In describing the framework, we assume some familiarity with template metaprogramming, as described, e.g., in [1].

The API we suggest allows the use of “the right iterator in the right place” [65] by providing a uniform method for requesting iterators from containers. The API allows the selection of the iterator category to be a parameter of a generic function, allowing the library writer to defer decisions about the category of an iterator to the user. This is an especially important capability since, in many cases, making such a choice at the time of writing a function can create unsatisfactory trade-offs in the implementation of the function.

B. Background and motivation

In the STL [100], access to the iterators of a container is provided through the member functions begin and end. Some containers support iteration backwards with the members rbegin and rend. Additionally, STL containers overload these functions for the case where the container is a const object. Any single container in the STL can thus provide up to four different iterator types (but essentially only two iteration schemes). For any particular STL container, these schemes are always the most capable iterators that the container can offer; for example, std::vector provides RandomAccessIterators, and std::list BidirectionalIterators [100]. The STL containers are thus closely tied to the iterator scheme they implement and to the iterator concept the return types of their begin and end functions provide. Even if a container could provide multiple iteration schemes over its data, the STL defines no generic interface for accessing them. An important side-effect is that the STL has become the de facto standard for defining the interfaces of containers and generic algorithms in active libraries such as the BGL, MTL (2, 4), etc.

In order to take advantage of alternative iterator schemes, we need a mechanism to access such schemes. A straightforward mechanism for doing this would be to provide a specific function name for each iteration scheme, as is already done with rbegin and rend. However, dedicating a specific function name to each iteration scheme does not work well with generic programming: function names become hard-wired in the implementations of generic algorithms. When writing a generic algorithm, the iteration scheme is not necessarily known. For example, the following code shows a function which is parametrized by the container type; the algorithm requests the iterators from the container and performs a search.

template <typename Ctr>
void fun (Ctr& ctr, ...) {
    typename Ctr::iterator f = ctr.begin(), l = ctr.end();
    // ...
    search(f, l, v);
    // ...
}

The particular iterator selected from the container ctr cannot change. For example, the end-user may want the container to be searched in reverse order. This can only be accomplished in an indirect fashion: (1) rewrite the function; (2) reverse the container; or (3) build some proxy to forward calls to begin and end to rbegin and rend.

Using distinct member functions for each iterator scheme also goes against the principle of specialization. Algorithm specialization automatically selects the best available implementation for an algorithm, but gracefully degrades to a slower version if the requirements of the faster ones are not met. Similarly, we wish to allow a request for a particular iterator scheme, but settle for a less specialized one, if the exact requested one is not supported by a particular container.

The STL implements algorithm selection using tag dispatching: a fixed set of tag structs which each correspond to a particular iterator concept. The tag of any iterator type can be accessed via the iterator_traits machinery. The expression iterator_traits<Iter>::iterator_category is guaranteed to denote the tag of the type Iter, if Iter conforms to one of the iterator concepts [100]. Similar to iterator categories, we use tags to refer to different iterator schemes. Our iterator request framework defines the global functions begin and end that take a container type as their function parameter, and additionally a type argument specifying the requested iterator tag. In this way the iterator tag is not a fixed part of the signature.

The iterator tag can be, for example, a type parameter at the call site to the begin and end functions—a generic algorithm can itself be parametrized over the tag, allowing the caller of the algorithm to specify the tag requested in the interior of the algorithm. This allows clients of the generic algorithm to “reach through” the algorithm to specify functionality.

C. Iterator request framework

The iterator request framework consists of a family of begin and end functions, a metafunction that computes the type of the iterators returned by the begin and end functions, and some helper functions and metafunctions. We first describe the metafunction iterator, shown in Figure 2, that specifies the type of the iterators for a given iterator tag–container pair. By default, iterator<Tag, Container>::type resolves to the member type iterator in the type Container. The default is thus to access the iterator member type in the STL containers. To make the iterator types of other iteration schemes accessible, a generic library must specialize the iterator template for the relevant iterator tag–container type pairs.

template < typename Tag, typename Container >
struct iterator {
    typedef typename Container::iterator type;
};

template < typename Arg >
struct const_ {
    typedef typename Arg::const_iterator type;
};

template < typename Tag, typename Container >
struct const_< iterator<Tag, Container> > {
    typedef typename Container::const_iterator type;
};

Fig. 2. The iterator and const_ metafunctions for the framework.

The second metafunction, const_, is for convenience; it provides access to the type of the iterators implementing the constant version of the requested iteration scheme. The default is, analogously, the const_iterator member type of the container parameter. We provide const_ to match C++’s const qualifier.

In addition to the above metafunctions, the interface to the library includes the functions begin, end, const_begin, and const_end. The first two functions provide both constant and non-constant access to the iterators. The latter two functions are included to aid in situations when a constant iterator scheme is needed, but where the current context is non-constant. The addition of the const_* form of the interface functions does not violate our principle of providing generic names for access to iterators: the const qualifier is a built-in language feature separate from the usual algorithmic or data-structure concerns, such as reverse, or random-access. All four functions are parametrized over both an iterator tag and a container type.

Their return types are computed with the iterator metafunction discussed above, and shown in Figure 2.

Figure 3 shows all versions of the begin interface functions; the implementations of the end functions are analogous. All the interface functions merely forward the calls to appropriate tagged_begin or tagged_end functions, implementing the three different versions of begin and end with only two “back-end” functions. This means less work for the container implementer.

template < typename Tag, typename Container >
typename iterator<Tag, Container>::type
begin ( Container& ctr ) {
    return tagged_begin(ctr, Tag());
}

template < typename Tag, typename Container >
typename const_< iterator<Tag, Container> >::type
begin ( Container const& ctr ) {
    return tagged_const_begin(ctr, Tag());
}

template < typename Tag, typename Container >
typename const_< iterator<Tag, Container> >::type
const_begin ( Container const& ctr ) {
    return tagged_const_begin(ctr, Tag());
}

Fig. 3. The set of begin functions for the framework.

The tagged_begin and tagged_end functions are shown in Figure 4. The default versions of these functions forward to the container’s member functions begin and end, using the current STL convention. The second function is the same as the first, except that the computed return type is constant. It is up to the container or algorithm implementer to overload these functions to return the desired iterator for a particular iterator tag. In sum, to add a new iterator scheme, one metafunction must be extended with a new class template specialization (iterator) and with four function template overloads (the const and non-const versions of the tagged_begin and tagged_end). Codes showing the implementations of the specializations and overloads are shown in the next section.

template < typename Tag, typename Container >
typename iterator<Tag, Container>::type
tagged_begin ( Container& ctr, Tag ) {
    return ctr.begin();
}

template < typename Tag, typename Container >
typename const_< iterator<Tag, Container> >::type
tagged_const_begin ( Container const& ctr, Tag ) {
    return ctr.begin();
}

Fig. 4. The set of tagged_begin functions for the framework.

D. Examples

In this section we demonstrate the use and benefits of the iterator request framework with three examples. The first one is the image example discussed in Section A, for which we present some run-time performance information; the second is in the context of the STL; and the third is our motivating example taken from the computational protein crystallography context.

1. Timings for images

In Section A we presented two alternative schemes of iterating over the pixels of an image. To demonstrate the importance of being able to select the most suitable iteration scheme for such image containers, we measured the performance difference of the two different iteration schemes, which we refer to as hierarchical and linear.

Our implementation of the image was a wrapper around std::vector. Access to the iterators is through the framework’s begin and end functions, using either of the tags hierarchical or linear. Each of the hierarchical iterators stores a pointer to its parent iterator and an integer offset from the parent’s offset. That is, the raster iterator stores a pointer to the image and the raster-line it is representing. A pixel iterator stores a pointer to its parent raster-line iterator and an offset into that raster-line to the pixel. And the channel iterator stores a pointer to its parent pixel iterator and an offset into the pixel.

We assume that we have a data-structure representing a raster-image with an interface providing (at least) two iterator schemes. The first scheme is for access to hierarchical iterators through the functions hierarchical_begin and hierarchical_end, and the second is for access to linear iterators through the functions linear_begin and linear_end. The implementation of the iterator metafunction and the functions for tagged_begin for hierarchical iterators are shown below.

struct hierarchical {};

template <typename Container>
struct iterator<hierarchical, Container> {
    typedef typename Container::hierarchical_iterator type;
};

template <typename Container>
struct const_< iterator<hierarchical, Container> > {
    typedef typename Container::const_hierarchical_iterator type;
};

template <typename Container>
typename iterator<hierarchical, Container>::type
tagged_begin ( Container& ctr, hierarchical ) {
    return ctr.hierarchical_begin();
}

template <typename Container>
typename const_< iterator<hierarchical, Container> >::type
tagged_const_begin ( Container const& ctr, hierarchical ) {
    return ctr.hierarchical_begin();
}

The first line of the code defines the iterator scheme tag for hierarchical iterators. The two class template definitions that follow define the specialization of the iterator metafunction (and its const_ counterpart) for any container that supports a hierarchical iterator, and access to that iterator via the member functions hierarchical_begin() and hierarchical_end(). The remaining two function templates define the functions tagged_begin and tagged_const_begin which, again, are defined for any container that satisfies the concept of a raster-image with member functions to gain access to the first and last hierarchical iterator. The definitions of the functions tagged_end and tagged_const_end are similar.

Size          1 channel   3 channels   6 channels
128×128       1.75        2.15909      2.24436
256×256       1.74011     2.15009      2.23694
512×512       1.73558     2.14746      2.24312
1024×1024     1.73084     2.15235      2.25039

Fig. 5. Iterator execution times. The ratios of the execution time of the hierarchical iteration scheme over the execution time of the linear iteration scheme, measured by timing the execution of a function essentially equivalent to the STL’s fill. The columns are the number of channels, and the rows are the number of pixels.

We measured the performance of iterating through images with varying width, height, and number of channels using both the hierarchical and linear iteration schemes. Our test algorithm was a variation of the STL fill algorithm: we assigned the value “127” to each channel value in the image. The measured data are depicted in Figure 5, which shows the ratios of execution time of the hierarchical iterator scheme implementation to that of the linear iterator scheme implementation. The linear iterator is generally about twice as fast as the hierarchical iterator on a PowerBook5,6 G4 at 1.67 GHz with 2 GB of RAM.

2. Alternate run-time characteristics for std::find

In the above example with images, the selected iterator scheme affected the implementation of the algorithm. In this section, we show how changing the iterator scheme can affect the run-time performance, even run-time behavior, of the same piece of code. In particular, we take a linear deterministic algorithm and convert it into a Las Vegas algorithm. A Las Vegas algorithm is a randomized algorithm that terminates when some stopping condition is met or a certain number of iterations have occurred.

The input arguments of the STL find algorithm must be InputIterators, at minimum. The find algorithm implements a straightforward sequential search. The run-time performance for find is dependent upon the distribution of the data in the sequence being iterated over, leading possibly to the worst-case behaviour being realized frequently. In such a case, a randomized algorithm, e.g., a Las Vegas algorithm, can possibly guarantee a better average complexity of iterator increments and dereferences.

With the iterator scheme selection framework we can parametrize a generic algorithm over the iteration scheme, allowing the client to choose the iteration scheme to be used in the find algorithm. In the following example, we associate the iterator tag linear with the sequential iterator scheme and the tag las_vegas with the Las Vegas iteration scheme:

template <typename Tag, typename Container>
void uses_find ( Container const& ctr,
                 typename Container::value_type const& v ) {
    // ...
    std::find(begin<Tag>(ctr), end<Tag>(ctr), v);
    // ...
}

Note that the uses_find function does not need to change in order to change the iterator scheme for std::find; the Tag type parameter tunnels through to find, as demonstrated by the following code fragment:

std::vector<int> a(1000, 0), b(1000, 1);
a.insert(a.end(), b.begin(), b.end());
a_function_that_calls_find<linear>(a, 1);
a_function_that_calls_find<las_vegas>(a, 1);

3. Alternate run-time characteristics for std::find II

In the above example we showed how to change the find algorithm from being deterministic to being non-deterministic. In this section we show how to change the algorithmic complexity of find from linear to logarithmic with respect to the number of elements in the sequence. To do this we rely on the fact that a sorted random-access sequence of elements with a strict-weak ordering can be searched using a binary-search algorithm.

Our solution requires that a specialization of find exists which resolves to a call to lower_bound; such a version of find would need to be integrated into a library, e.g., the STL, and could be implemented like this:

template <typename Iter>
Iter find(Iter first, Iter last,
          typename Iter::value_type const& v,
          ra_sorted_iterator_tag) {
  return lower_bound(first, last, v);
}

Note that this implementation requires that the iterator category for sorted random-access iterators has a tag ra_sorted_iterator_tag. Because the iterator category tag is defined by a specialization of the STL's iterator_traits template metafunction, we must define a mechanism for retroactively 'changing' the iterator_category associated type of the container's iterator. This means we must have the following: (1) a specialization of the iterator selection framework for sorted random-access iterators and containers; and (2) a way of retroactively changing the associated type iterator_category of an iterator. We begin by defining a specialization of the iterator template metafunction for random-access iterators and a parametrized iterator scheme tag ra_sorted:

template <typename Tag> struct ra_sorted {};
struct ra_sorted_iterator_tag {};

template <typename Tag, typename Ctr>
struct iterator<ra_sorted<Tag>, Ctr> {
  typedef ra_sorted_iterator_proxy<
    typename iterator<Tag, Ctr>::type> type;
};

We do not show the const version of the iterator. The iterator scheme tag ra_sorted is instantiated with the iterator scheme tag of the underlying random-access iterator that will be extracted from the container. The template metafunction iterator is specialized over the parametrized tag ra_sorted and some container, and computes the type of the resultant iterator: essentially, this specialization of the template metafunction iterator wraps the container's random-access iterator in a proxy object called ra_sorted_iterator_proxy. This proxy object implements a random-access iterator by owning a random-access iterator from the container, and forwarding calls made to the proxy object to the container's random-access iterator. The iterator category of ra_sorted_iterator_proxy is ra_sorted_iterator_tag. To finish this adaptation we must have appropriately defined versions of the functions begin and end. For instance, the back-end function tagged_begin could be implemented like this:

template <typename Tag, typename Container>
typename iterator<ra_sorted<Tag>, Container>::type
tagged_begin(Container& ctr, ra_sorted<Tag>) {
  return typename iterator<ra_sorted<Tag>, Container>::type(
    tagged_begin(ctr, Tag()));
}

The function tagged_begin takes a container and the instantiated ra_sorted tag and returns the proxy iterator object. The function is implemented by constructing a proxy object with the result of calling the function tagged_begin recursively, where the tag is the parametrized tag Tag. The following code fragment demonstrates the use of the ra_sorted parametrized iterator scheme:

std::vector<int> v;
... // v is sorted
a_function_that_calls_find<ra_sorted<linear> >(v, 1);
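The proxy class itself is not shown above. The following is a minimal sketch of its shape, assuming the wrapped iterator is a random-access iterator; only the operations needed by lower_bound are spelled out, and the remaining random-access operations are elided.

#include <iterator>

template <typename Iter>
class ra_sorted_iterator_proxy {
  Iter it_;  // the container's underlying random-access iterator
public:
  typedef ra_sorted_iterator_tag iterator_category;  // the retagged category
  typedef typename std::iterator_traits<Iter>::value_type value_type;
  typedef typename std::iterator_traits<Iter>::difference_type difference_type;
  typedef typename std::iterator_traits<Iter>::pointer pointer;
  typedef typename std::iterator_traits<Iter>::reference reference;

  explicit ra_sorted_iterator_proxy(Iter it) : it_(it) {}

  // Forward the iterator operations to the owned iterator.
  reference operator*() const { return *it_; }
  ra_sorted_iterator_proxy& operator++() { ++it_; return *this; }
  ra_sorted_iterator_proxy& operator+=(difference_type n) { it_ += n; return *this; }

  friend ra_sorted_iterator_proxy operator+(ra_sorted_iterator_proxy p, difference_type n)
  { return p += n; }
  friend difference_type operator-(ra_sorted_iterator_proxy a, ra_sorted_iterator_proxy b)
  { return a.it_ - b.it_; }
  friend bool operator==(ra_sorted_iterator_proxy a, ra_sorted_iterator_proxy b)
  { return a.it_ == b.it_; }
  friend bool operator!=(ra_sorted_iterator_proxy a, ra_sorted_iterator_proxy b)
  { return a.it_ != b.it_; }
  // (remaining random-access operations omitted)
};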

4. Protein crystallography

The impetus for the iterator request framework came from the design of generic algorithms for computational protein crystallography (PX). This section describes how, without such a framework, writing generic code leads to unacceptable trade-offs: the programmer must either depend on library-specific data structures, or accept an unreasonable loss of efficiency—a performance penalty of up to a factor of 30. We first briefly introduce the field of protein crystallography, followed by a discussion of implementing one of the key algorithms of PX libraries. We then demonstrate the use of our framework, which avoids the above trade-off, achieving generality and efficiency simultaneously.

a. Background information on PX

In computational protein crystallography, one of the fundamental data structures is the electron density container. This is a container which represents the "presence" of an electron at a particular point in space—literally, the probability of an electron being at a given point in space. The data comes from bombarding a crystal of a protein with X-rays [83, 48]. The electron density is used in algorithms and programs to help the crystallographer construct a model of the protein under investigation, to be used in drug discovery, determining novel structures, and so forth [83]. The libraries we primarily work with are the PX packages TEXTAL [49] and CCTBX [47]. TEXTAL is a tool chain that automatically builds protein models from electron density data [49]. CCTBX is a library of algorithms and other tools to aid in the development of PX software [47].

Within CCTBX and TEXTAL—as with other PX libraries not considered here [13, 18]—the electron density container has numerous different representations [83, 48, 47, 49], leading to numerous ways of iterating over the data. In TEXTAL and CCTBX the iterators are also coordinates in 3D space, i.e., the type which represents an iterator with the normal semantics (increment, dereference) is also a type which implements the semantics of a 3-dimensional vector, or an offset, depending on the iterator. Access to the data in an electron density container is then provided either by dereferencing the iterator in the normal way, or by "passing" the iterator as a coordinate to the electron density container, usually via operator[]. The two concepts are mixed to provide programmers the syntactic convenience of iterators, on the one hand, and, on the other, coordinate access to the same data when that is more natural. There are a large number of coordinate systems—and equivalent iterator schemes—for the electron density container, which arise from the various descriptions of the topology of the electron density data.

The large number of iteration schemes arises because there are four coordinate systems and three so-called "symmetry" representations. The coordinate systems are known as the linear-array, grid, fractional, and cartesian systems. Of these, the grid coordinate system—and its equivalent iteration scheme—is the default for the CCTBX and TEXTAL libraries. The grid coordinates essentially represent the data as a 3-dimensional array. The linear-array is a 1-dimensional array, and the underlying structure which holds the data. The fractional and cartesian coordinate systems are 3-dimensional systems which are used to conveniently represent the underlying topology and "real" space, respectively. All of the coordinate systems and iteration schemes are necessary, and should thus be accessible to the client of these libraries [13, 18, 47, 49].

For each coordinate system there are three levels of increasing symmetry: non-symmetric, translation-independent, and asymmetric. The symmetry occurs due to the mechanism used to gather the PX data [83, 48]. Intuitively, the symmetries can be thought of as patterns used for tiling: non-symmetric means to not tile; translation-independent means to use squares without any features (a checker-board without colors); and asymmetric means to use the smallest, non-repeating portion. Figure 6 depicts the symmetry classes with a simplified example. The four coordinate systems and the three symmetries amount to ten—we exclude the two higher symmetries for the linear-array coordinate system—different RandomAccessIterators possible for any electron density container.

Each iterator type is useful depending upon the algorithm in question. For example, since negative values are not well defined for electron density, some algorithms [49] set negative density values to zero; this can only be done with the linear-array or grid coordinate systems, because only those iterators provide mutability.


Fig. 6. The three types of symmetry. The image represents a finite subset of the infinite symmetric plane. Box 1 (dotted line) represents the asymmetric symmetry. Box 2 (solid line) represents the translation-independent symmetry. Box 3 (dashed line) represents a non-symmetric subset. In theory, algorithms that operate on the asymmetric unit should be faster because they cover less data; in practice, the cost of discovering the proper symmetry operator to map the data back into the asymmetric unit more than offsets this.

Real-space refinement, a method to make the modeled protein fit better into the electron density data, is a critical technique in PX that relies heavily on cartesian coordinates. This is because cartesian coordinates have an intuitive notion of distance and direction [33, 53]. Translation-independent and asymmetric symmetries, which are most naturally expressed in the fractional coordinate system, are useful in isolating a unique model [2].

b. Density interpolation algorithm

The density interpolation (DI) function is a critical algorithm used in PX. It computes an electron density value at an arbitrary coordinate in space based on the known stored electron density values surrounding that coordinate. The DI algorithm is invoked, e.g., in the inner loop of the real-space refinement algorithm [33] that fits a model into the electron density data, and must therefore be efficient. This necessitates several different iterator schemes for accessing data in the electron density container. In particular, the non-symmetric symmetry can dramatically benefit from using the linear-array coordinate system.

The iteration selection framework gives access to many iterator schemes, which allows an implementation of an efficient DI algorithm in a generic fashion. The pseudo-code shown in Algorithm 1 outlines the computation of an interpolated electron density value at a given coordinate.

input : P coordinate; E electron density
output: V linearly interpolated value

grid-point ← convert-to-grid-coordinate-system(P);
grid-iters[8] ← get-iterators-surrounding(grid-point);
values[8] ← get-values-of(grid-iters);
distances ← get-distances-to-grid-iters(grid-point, grid-iters[0]);
V ← linearly-weight-values-based-on-dist(distances, values);

Algorithm 1: Eight-point linear interpolation

The algorithm differentiates between a grid coordinate system point and a grid coordinate system iterator. The former is some arbitrary vector in 3-dimensional space defined by the grid coordinate system. The latter is a dereferenceable and mutable iterator of the grid iterator scheme, pointing to a value stored in memory. To compute the interpolated density, the algorithm first converts the coordinate from its given coordinate system (cartesian, fractional, etc.) to the grid coordinate system. The algorithm then computes the coordinates of the eight grid coordinate system iterators which surround the given point. Then, the values of the iterators are acquired—this can be a non-trivial computation, potentially exploiting the asymmetric, translation-independent, or non-symmetric symmetries. The distance from the point whose value is being interpolated to the "lower-left" grid iterator is calculated, and six weights are computed from the components of the distance as depicted in Figure 7.
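In standard trilinear form, the final weighted average is the following (this rendering is ours; $v_{ijk}$ denotes the density value at the corner offset by $i$, $j$, $k$ from the lower-left iterator):

$$V = \sum_{i,j,k \,\in\, \{0,1\}} w_i(\alpha)\, w_j(\beta)\, w_k(\gamma)\, v_{ijk}, \qquad w_0(t) = 1 - t,\quad w_1(t) = t.$$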

Fig. 7. The eight-point interpolation algorithm. (1) Given an arbitrary point in space, (2) find the eight surrounding grid iterators/coordinates, and (3) measure the distance from the given point to the "lower-left" iterator. The distances—α, β, γ—are used to weight the values of the eight coordinates to find the linearly interpolated value. Finding the eight surrounding points may require finding symmetric copies which are possibly "far away" in the underlying data, necessitating costly computations. For the non-symmetric case no computations are needed; for translation-independent data the computations are modulus operations; but for asymmetric data, searching a list of 4×4 rotation-translation matrix operators is required. Because TEXTAL and CCTBX do not have efficient algorithms to find the operator, asymmetric algorithms are almost always slower than their translation-independent equivalents, even though there is less data.

c. Generic density interpolation using iterator selection

The most efficient way to compute the density interpolation is with the non-symmetric symmetry using the linear-array iterator scheme. Our benchmark for the fastest DI implementation is the TEXTAL function InterpolateDensity, which makes these assumptions about the data. In addition, TEXTAL's InterpolateDensity is highly optimized: for example, it hand-unrolls all the loops that compute the weighted average. To acquire the values of the grid coordinates surrounding the point to be interpolated, the TEXTAL code converts the lower-left grid coordinate to a linear-array iterator, then uses pre-computed offsets to find the other seven surrounding grid points, similar to Algorithm 2. By using pre-computed offsets, the algorithm saves the recomputation of linear offsets into the underlying linear-array for the other seven grid iterators.

The TEXTAL InterpolateDensity code is dependent upon the particular selection of symmetry. Furthermore, different stages of the algorithm use different iterator schemes. TEXTAL encodes the concrete iterator types directly into the InterpolateDensity function, which is a non-generic function that only works with TEXTAL's data structure representing the electron density map.

As described above, the interpolation algorithm consists of two parts: acquiring the values of the surrounding points, and computing the value at the current coordinate as a weighted average of the values of the surrounding points. The former part must be encoded differently for different symmetries; the latter part works for any symmetry. In order to make the code generic, we must first factor the symmetry-dependent value-acquisition code out of the symmetry-independent weight-averaging code. The weight-averaging code uses only the grid iterator scheme. The value-acquisition code uses different iteration schemes depending upon the symmetry.

In our implementation of the value-acquisition code for non-symmetric data, shown in Algorithm 2, we use the linear-array iteration scheme. To acquire a linear-array iterator we use our framework with the tag linear. We then convert the input grid coordinate to an offset with the function linearize.

1   input : E Electron Density; X Grid Coordinate; values Values
2   output: values Values
3   data ← begin<linear>(E);
4   data += linearize(X);
5   values[0] ← *data; values[1] ← *(data+1);
6   data += stride(E,0);
7   values[2] ← *data; values[3] ← *(data+1);
8   data += stride(E,1);
9   values[6] ← *data; values[7] ← *(data+1);
10  data -= stride(E,0);
11  values[4] ← *data; values[5] ← *(data+1);

Algorithm 2: Part of our implementation of the electron density interpolation algorithm. Here, we acquire the values of the eight surrounding coordinates of a given point. We use the iterator selection framework to choose the linear-array iterator for this purpose (see line 3). This code uses pre-computed offsets stored in the electron density container, and accessed by the function stride. The iterator is offset by the linearized grid coordinate X to get to the value in the lower-left corner; the other seven values are calculated from pre-computed offsets into the linear-array. Using the linear-array iterator, instead of a grid iterator, dramatically boosts the speed of this implementation. Except for a change to line 3, this code is nearly identical to the corresponding part of TEXTAL's InterpolateDensity function.

The resulting code is nearly identical to the TEXTAL code, except for the call to acquire the alternate iterator scheme. Without the ability to specify the linear-array iterator scheme we would be unable to take advantage of the "pre-computed offsets" optimization described above. This very localized change allows us to write both symmetry-independent and generic code, making the algorithm usable with any electron density container: the algorithm is parametrized over the coordinate type, the electron density container, and the symmetry type. This means that the client can pass in a point from any coordinate system (cartesian, fractional, grid, linear), implicitly or explicitly specify any type of symmetry (non-symmetric, translation-independent, asymmetric), and get an interpolated value.

We compared our implementation of the density interpolation algorithm to that of TEXTAL's; to the comparable but less optimized standard density interpolation routine in CCTBX called nonsymmetric_eight_point_interpolation; and to CCTBX's routine basic_map::get_value, which is more general than the above two. The CCTBX basic_map::get_value function supports different symmetries in the value acquisition through the use of virtual functions and overloading. Using a dual-processor, hyper-threaded, 3.04 GHz Pentium IV with 2 GB of RAM, our generic C++ implementation of the algorithm for density interpolation runs about 5–10% faster than TEXTAL's InterpolateDensity. The speed increase is due to suggestions^1 for additional optimizations—which we implemented—in the TEXTAL code-base. Compared to CCTBX's nonsymmetric_eight_point_interpolation routine, our implementation is about 3–5 times faster, and it is about 30 times faster than CCTBX's basic_map::get_value. We have written the necessary adaptors to use our generic algorithm with TEXTAL's emapT C-struct, and with CCTBX's numerous electron density container implementations.

^1 Source: Dr. T. Ioerger and Erik McKee.

E. Computing with capabilities

An important aspect of the iterator selection library presented here is the use of tagging as an annotation feature. The basic idea is to allow the user to add annotations which augment data-structures with information about the semantic capabilities of those data-structures, e.g., the traversal order of hierarchical iterators, the required algorithmic complexities, etc. This means that part of the definition of the interface of a data-structure is the declaration of the various capabilities that the data-structure offers. Those declarations are then accessible to the routines of an active library. For instance, if the user annotates a data-structure with complexity guarantees, then the interface of the data-structure must be defined to allow algorithms both to know that such an annotation exists and to access that annotation. In this way the capabilities of a data-structure can 'permeate' through the function-call boundary and be available to the software library.

The implementation of the annotation mechanism used in this chapter relies on type-traits and type-tagging—although it could possibly be implemented more naturally using the proposed concepts feature [42]. The type-tags are then used as auxiliary parameters to an internally defined function. Because the mechanism used is ad hoc, it would be an interesting direction of study whether a more general selection mechanism is possible. We can envision a formalism that essentially defines rules for computing with capabilities, and a type system that is harnessed to perform these computations. Such a system would share many properties with Gregor's algorithm concepts [37]: a mechanism which attaches a lattice of qualitative and quantitative annotations to data-structures, and a set of requirements onto algorithms, and then hijacks the function overload resolution mechanism to act as an algorithm selection mechanism. Algorithm concepts allow an arbitrary set of attributes to be defined, along with a pre-order on those attributes. If the attributes satisfy the notion of a meet semilattice, then the algorithm selection mechanism can find a best-matching algorithm for a given set of inputs. This could potentially allow for more accurate information for algorithm selection.

The use of quantitative annotations, e.g., run-time complexity or memory-utilization polynomials, rather than qualitative ones, e.g., sorted or random-access tags, is another important feature that could be considered. Such quantitative tags could be used to automatically annotate the run-time or memory complexity of an algorithm, or a call-site. Analyses for the automatic discovery of worst-case time complexity bounds exist; for an early example see Rosendahl [84]. More modern analyses employing much more sophisticated mechanisms, e.g., abstract interpretation, the use of automatic theorem provers, and other techniques, also exist; for example, see Ley-Wild [66]. Most such systems describe the run-time complexity of a few fundamental operations and then analyze the control-flow of a function to find upper bounds on the run-time (or memory) complexity. Such analyses could be used to deduce the run-time complexity requirements on the input parameters' interfaces for a routine. For instance, a routine such as the STL's lower_bound could require that the iterators passed to it have a run-time complexity for the function advance such that the composition of advance's run-time complexity and lower_bound's run-time complexity is never worse than O(log(N)), i.e., advance must have complexity O(1). We realize this is a far-from-concrete proposal; this discussion should be seen as laying out possible future avenues of research. For this dissertation, however, we will not consider such ideas in greater depth.
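As a rough illustration of the lower_bound example, the following C++ sketch attaches a complexity annotation to iterator types via a trait and uses tag dispatching to admit only O(1)-advance iterators. All names here (advance_complexity, constant_time, linear_time, checked_lower_bound) are hypothetical illustrations, not part of any existing library.

#include <algorithm>

struct constant_time {};  // advance is O(1)
struct linear_time {};    // advance is O(n)

// Conservative default annotation; specializations override it.
template <typename Iter>
struct advance_complexity { typedef linear_time type; };

// Pointers advance in constant time.
template <typename T>
struct advance_complexity<T*> { typedef constant_time type; };

template <typename Iter, typename T>
Iter lower_bound_impl(Iter first, Iter last, T const& v, constant_time) {
  return std::lower_bound(first, last, v);  // the O(log N) bound holds
}
// Deliberately no overload for linear_time: calling with an iterator whose
// advance is not O(1) fails to compile rather than silently degrading.

template <typename Iter, typename T>
Iter checked_lower_bound(Iter first, Iter last, T const& v) {
  typedef typename advance_complexity<Iter>::type tag;
  return lower_bound_impl(first, last, v, tag());
}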

F. Conclusion and future work

The proposed iterator request framework allows containers to provide multiple iterator schemes in an easily accessible way. The iterator scheme is requested using a tag class, and not by a dedicated function name corresponding to the iterator scheme. Thus, the iteration scheme can be a parameter in a generic algorithm: a generic algorithm does not have to hard-wire the iterator scheme it uses. A single algorithm can even use several iteration schemes from the same container. Selecting a different iteration scheme becomes a simple change to the tag used to request the iterator.

The particular techniques of this chapter can be implemented in C++, as demonstrated above. First, we begin with the assumption that there exist families of related algorithms and data-structures. In languages such as C and C++, members of a family are marked by mangling the names of the members, e.g., by literal name mangling, namespaces, embedding in classes, etc. The name-mangling conveys an annotation to the user about the semantics of the function or data-structure. This additional information is used implicitly by the functions that the data-structures are passed to. It is then up to the user to determine which function to call, based upon the capabilities of the data-structure being used.

Instead, in our technique, the additional information is added as an annotation that is accessible by functions or other data-structures (in C++ we use traits or concepts). The strategy is then to have a single function whose purpose is to select amongst a group of similar functions. This selection function takes, as inputs, the data-structures that are to be passed to the functions which actually implement the algorithm the user is interested in. However, the selection function operates on the capabilities of the inputs. The strategy we suggest for a selection function is to have a default implementation of an algorithm that assumes some minimal semantic capabilities of the data-structure. If, based upon the capabilities of the data-structures passed in, there exists a more efficient (faster, smaller footprint, etc.) implementation of the algorithm, then that implementation of the algorithm is selected instead. We suggest the use of a compile-time selection function based on function overloading and tag-dispatching to select amongst the versions of the run-time algorithm. The use of a compile-time selection function minimizes the impact on run-time performance, since the selection is performed during compilation and not at run-time.

In this chapter, we are proposing a method for adding to the information usually available to the algorithm which selects the run-time algorithm implemented by the function. Such extra information can encode properties that are not usually considered, e.g., fast vs. safe, etc. Depending upon the exact extra capabilities allowed, the explicit access given to the user could violate the preconditions of the functions being selected. (Alternately, the algorithm which selects the algorithm to be run can guard against such selections.) In this chapter, we do not give the user the ability to override correctness preconditions, and instead expose alternate, equivalent, and safe versions based upon the selected capabilities.
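A minimal sketch of such a compile-time selection function follows; the capability tags and the capability trait are illustrative names, not part of the framework's actual API.

#include <algorithm>

struct minimal_capability {};                      // default assumption
struct sorted_capability : minimal_capability {};  // a refinement

// The annotation point: users specialize this trait for their containers.
template <typename Seq>
struct capability { typedef minimal_capability type; };

// Default implementation: linear scan, assumes only minimal capabilities.
template <typename Seq, typename T>
bool contains_impl(Seq const& s, T const& v, minimal_capability) {
  return std::find(s.begin(), s.end(), v) != s.end();
}

// More efficient implementation, selected when the data is known sorted.
template <typename Seq, typename T>
bool contains_impl(Seq const& s, T const& v, sorted_capability) {
  return std::binary_search(s.begin(), s.end(), v);
}

// The selection function: dispatches at compile time on the capability tag.
template <typename Seq, typename T>
bool contains(Seq const& s, T const& v) {
  return contains_impl(s, v, typename capability<Seq>::type());
}

A client who knows that a particular container type is always kept sorted specializes the capability trait for it; no call-site changes, and the selection itself costs nothing at run-time.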

1. Readability

Using iterators in the interfaces of functions, rather than the containers the functions operate on, reduces readability [105]—an aspect of the minimum description length (MDL) principle. The MDL principle describes the trade-offs of performance and generality versus readability and maintainability. MDL argues that as the need for generality and the performance requirements of a function increase, the interface, i.e., the number of arguments, grows dramatically. Conversely, as the number of arguments to a function increases, the programmer's ability to use the function correctly (and to select the correct inputs!) drops. The iterator selection mechanism develops an API that reduces the complexity of the interface (exchanging pairs of iterators for the container) while still maintaining flexibility in the selection of parameters within the function, through optional tag parameters.

The motivation for the framework comes from our experiences implementing generic algorithms for the domain of computational protein crystallography. In that domain, the majority of the generic algorithms operate on containers, rather than pairs of iterators. Moreover, the containers support numerous iteration schemes, the choice of which has a dramatic impact on the performance of the algorithms. The framework allowed us to write our algorithms in a fully generic way, and to apply the most appropriate iteration scheme in each calling context. We found the framework crucial for writing efficient code for the complex data-structures of the domain.

The iterator selection framework is a generalization of the family of functions begin, end, rbegin, rend, where a fixed signature is used to refer to an iterator scheme. Abstracting the iterator scheme in the begin and end functions is a new and beneficial axis of parametrization in generic libraries. Essentially, by varying the iteration scheme, we can get drastically different behavior and performance from the exact same algorithm or code. In this chapter we discussed several iterator schemes and their corresponding tags in the context of isolated examples. In future work, our next step is to continue to analyze the iterator tags and to identify a set of generally applicable, "standardized" tags and their refinement hierarchy. We believe that with a full "concept analysis" a small number of useful tags, analogous to the STL's iterator hierarchy, can be developed. Furthermore, we note that while we have presented a framework for iterators, the idea is more general. Permeable interfaces for active libraries in C++ rely on associated types and traits classes to attach properties to types. However, those mechanisms can be inflexible: they allow neither transient properties which hold only occasionally at run-time, nor properties that hold only in some contexts. It is our viewpoint that active libraries should have APIs that are parametrized by (at least some of) the properties of the input types of their components. This will allow more expressive and powerful libraries.

CHAPTER III

MULTILAYER LIBRARY COMPOSITION

This chapter focuses on the definition, applications, and limitations of concepts and concept maps in C++, with an emphasis on library composition. We report on two cases of data structure adaptation between different libraries. We compose an image processing library with a graph algorithm library by making use of a transparent adaptation layer, enabling the effortless application of graph algorithms to the image processing domain. We use the adaptation layer to realize a few key algorithms, and report little or no performance degradation when compared to algorithms implemented directly for the image processing domain. We then extend this adaptation to include a composition of a graph library with a linear algebra library. This multilayer composition implements several image analysis algorithms by directly following their presentation in the research literature in terms of graphs and matrices; we thus have high confidence in their correctness.

A. Introduction

Modern software systems commonly make use of components from a variety of software libraries. Software libraries available to programmers are typically developed by different entities without centralized control. Consequently, different libraries' interfaces are seldom directly compatible. The cost and complexity of the code needed to combine libraries is significant, and can be prohibitively expensive: it may be easier to rewrite the needed components than to reuse them, or the performance overhead of the library composition mechanism may not be acceptable.

The language constructs and idioms for adaptation vary greatly between different programming languages, and can impact the cost of library composition. This chapter discusses programming with C++ "concepts" [40], a set of extensions that augment C++'s template system with constraints; this is an experimental system being considered for future versions of C++. We explore the applicability and limitations of these new features, particularly focusing on the use of concepts for library composition via non-intrusive component adaptation. At the moment, ConceptGCC [38] is the only compiler for ConceptC++. Note that in this chapter, we will refer to C++ extended with concepts as ConceptC++; C++ 2003 will be used to denote the language as specified in its current standard [51].

The C++ standard library's collection of generic algorithms and data structures, formerly called the Standard Template Library (STL) [97], was the central use case that influenced the design of ConceptC++. Consequently, the first application of ConceptC++ was the STL. Naturally, the next step is to explore the applications of these new language features to broader domains. In this chapter we report on the use of concepts for adapting entire library interfaces, enabling the composition of whole generic libraries. Specifically, we (1) demonstrate how to adapt components in a non-intrusive and efficient manner using concepts, (2) guide programmers in the effective use of the C++ concepts feature, (3) report on the evaluation of the new features in ConceptC++ for the support of complex library composition, (4) evaluate the performance implications of the new features, (5) compare and relate the features to other adaptation mechanisms in C++ and in other languages, and (6) raise some issues and describe challenges faced when programming with concepts.

The structure of the chapter is as follows. Section B briefly summarizes the paradigm of generic programming that has motivated the design of ConceptC++, and introduces the main features of ConceptC++. Section 1 begins by demonstrating how one of these features, the concept map language construct, can be used to adapt generic components. Section C describes a complex library composition scenario, where a transparent adaptation layer enables the use of an open-ended set of image types as inputs to a library of graph algorithms; we then show the composition of the graph algorithm library with an iterative eigensolver library, allowing the use of graph data-structures as inputs to iterative eigensolver algorithms. Performance of such adaptations is discussed in each relevant section. Section G relates concepts and concept maps to other adaptation mechanisms, such as instance declarations in Haskell and inheritance in object-oriented languages, and discusses concept maps' strengths and limitations. Conclusions follow in Section G.

B. Background

The design of ConceptC++ has mainly been motivated by the desire to better support the paradigm of generic programming, as practiced, for example, in the design and implementation of active libraries such as the Standard Template Library, the Boost Graph Library (BGL) [89], the Matrix Template Library [90], and other generic libraries in a variety of domains [28, 10, 3, 81]. The generic programming approach to library design has proven to support the production of efficient and reusable libraries. For example, the STL and the BGL are both large libraries providing extensive functionality, yet the interfaces to these libraries are quite small. Through careful consideration of the essential requirements for related classes of algorithms, the interface to a large number of library components has been made small and uniform. The generic programming paradigm, and generic libraries, are of interest in the context of library composition since adapting a particular widely applicable generic library interface to the requirements of another library may open up significant reuse opportunities. For example, the BGL, with a few dozen lines of code, implements a transparent adaptation layer on top of some graph data structures of the LEDA library [74], making the entire BGL usable for LEDA graphs without requiring any explicit wrapping or adaptation [89, §14.3.5].

1. From C++ 2003 to ConceptC++

In C++ 2003 templates are unconstrained. Generic C++ 2003 libraries, therefore, generally express constraints on the type parameters of generic algorithms as part of the algorithms' documentation and as names of template parameters. The STL established a systematic documentation style for this [93, 5], in which requirements on one or more types are collected into tables. These tables describe the functions and operators that the types must support. They can also require a set of other accessible types, called associated types. Naming conventions for template parameters are used to indicate the corresponding requirements table. These conventions, however, do not serve as first-class artifacts from the compiler's perspective: they only exist in the mind of the programmer.

Interfaces between components that are visible to the compiler are a key element in successful composition strategies and lead to a number of benefits. First, it is possible for the compiler to check and enforce the contract expressed by an interface, to various extents depending on the language. Next, semantic properties captured in interface specifications can be leveraged in the optimization and transformation of code. For instance, if the interface includes a binary operator and an absorber element for that operator, e.g., multiplication and 0, then any expression of the form a*b where either a=0 or b=0 can be transformed to the expression 0. Additionally, interfaces as first-class language constructs also offer a place to bundle related signatures and constraints. This gives programmers the ability to create coherent constructs about which readers can reason. This helps to reduce the long-term maintenance costs of the code.
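As a sketch of how such a semantic property might be stated, using the axiom notation of ConceptC++ introduced later in this section (the concept and all of its names are ours, not from any of the libraries discussed):

concept MultiplicativeWithZero<typename T> {
  T operator*(const T&, const T&);
  T zero();
  axiom Absorption(T a) {
    a * zero() == zero();
    zero() * a == zero();
  }
}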

When introduced to generic programming with C++ 2003, programmers accustomed to object-oriented languages have difficulty when they fail to find component requirements in constructs analogous to the familiar abstract base classes. Such challenges are exacerbated by lengthy compiler diagnostics^1 that arise because the bulk of the type checking of templates in C++ 2003 occurs late, at the time of their instantiation. ConceptC++, at last, changes this for generic programming in C++. The concept language construct gives an explicit representation of the syntactic requirements tables, and it is precisely these concept definitions that serve as the first-class interfaces. Since concepts are analyzed by the compiler (in addition to the programmer), modular type checking of templates is possible; features such as concept-based overloading also become easier to use. For example, ConceptC++ can provide notably more informative compiler error diagnostics [40].

Note that the development of ConceptC++ was preceded by cleverly designed template libraries that emulated concepts. These libraries provide some support for expressing requirements tables programmatically, for enforcing constraints on template parameters expressed using those tables, and for rudimentary "type checking" of template bodies [73, 91]. These techniques are brittle and expert-friendly, and have not found their way to wide use.

The STL's requirements tables also specify semantic requirements, as algebraic laws that implementations of required functions must satisfy, as well as upper bounds on the algorithmic complexity of these functions. ConceptC++ supports expressing algebraic laws [44, §14.9.1.4] in concepts, as axioms. Type checking in ConceptC++, however, is not concerned with concepts' axioms (except for insisting that the expressions in the axioms themselves are well-formed).

^1 Järvi et al. have experienced a 20MB error message from a single client error in the use of an, admittedly very complex, template library [56].

Compilers and programmers can use axioms to justify optimizations, and they serve as a hook for auxiliary language tools. We do not further explore the usefulness of axioms in this dissertation.

2. Generic programming in ConceptC++

This section briefly describes the new language constructs in ConceptC++. A more detailed description is available in the C++ concepts proposal overview [40], and in the current full specification of concepts [44]. The central new construct is the concept. It defines a set of requirements on a type or tuple of types. We say that types that satisfy the requirements of a concept model that concept. For example, the following concept [42, §20.1.2] requires that the less-than operator (<), and the other comparison operators, are defined for objects of type T:

concept LessThanComparable<typename T> {
  bool operator<(const T& a, const T& b);
  bool operator>(const T& a, const T& b)  { return b < a; }
  bool operator<=(const T& a, const T& b) { return !(b < a); }
  bool operator>=(const T& a, const T& b) { return !(a < b); }
}

ConceptC++ requires an explicit declaration, a concept map,^2 to establish that a particular type (or a parametrized class of types) models a concept. For example, the following definition states that the type int models the concept LessThanComparable:

concept_map LessThanComparable<int> { }

Another concept map makes the user-defined type name a model of the concept LessThanComparable:

^2 The possibly more descriptive keyword "model", which was used in preliminary designs of ConceptC++, was replaced with the keyword concept_map, which occurs far less frequently as an identifier in existing C++ code.

struct name { char* first; char* last; };

int namecmp(const name& n1, const name& n2) {
  int c = strcmp(n1.last, n2.last);
  if (c == 0) return strcmp(n1.first, n2.first);
  else return c;
}

concept_map LessThanComparable<name> {
  bool operator<(const name& a, const name& b) {
    return namecmp(a, b) < 0;
  }
}

The two concept maps differ in how they satisfy the LessThanComparable concept's requirements. For int, the body of the concept map is empty; the built-in comparison operators for integers satisfy the four requirements of the LessThanComparable concept. For name, one of the required operations, the less-than operator, is defined in the body of the concept map. This suffices to make name a model of the LessThanComparable concept, since the concept's body provides default implementations, defined in terms of the less-than operator, for the other three required operations. A type can satisfy a requirement, for example the requirement for a less-than operator, in any one of the following ways, in decreasing order of precedence: (1) the concept map can define the less-than operator explicitly; (2) the less-than operator can be defined as a member or non-member function (or in some cases as a built-in operation); or (3) a default implementation in the concept itself can provide a definition. In a well-formed concept map, each required operation is defined in at least one of these ways.

Explicit definitions of functions in the bodies of concept maps are a powerful tool for adaptation. For example, objects of the name class can be compared using the namecmp function, whose interface is similar to that of the strcmp function. The concept map above adapts name to satisfy the requirements of the LessThanComparable concept, and defines the less-than operator in terms of the existing namecmp function. In this adaptation, the definition of the type name does not need to be modified, and objects of type name do not need to be wrapped by other types, to be comparable with the less-than operator. Definitions in concept maps do not introduce functions or type names into the global scope. For example, the less-than operator defined in the above concept map is only visible in generic definitions constrained by the LessThanComparable concept.

Concept maps can be made generic with templates. For example, the following concept map declares all instances of the standard template pair to be models of LessThanComparable—as long as the element types of the pair model the concept LessThanComparable. The constraints on the element types, the template parameters T and U, are stated in the requires clause, introduced with the requires keyword.

template <typename T, typename U>
requires LessThanComparable<T> && LessThanComparable<U>
concept_map LessThanComparable<pair<T, U> > {
  bool operator<(const pair<T, U>& a, const pair<T, U>& b) {
    return a.first < b.first
        || (!(b.first < a.first) && a.second < b.second);
  }
}

Figure 8 shows a simple generic algorithm, min_element; the algorithm uses the LessThanComparable concept as a constraint. Constraints in the requires clause are assumed to hold during type checking of the template's body, and they are enforced at the time of template instantiation. The ForwardIterator concept that appears in the constraints of min_element is shown in Figure 9. This concept provides basic iteration capabilities. The indirection operator (*) gives the value that an iterator refers to.

template <typename Iter>
requires ForwardIterator<Iter>
      && LessThanComparable<Iter::value_type>
Iter min_element(Iter first, Iter last) {
  Iter best = first;
  while (first != last) {
    if (*first < *best) best = first;
    ++first;
  }
  return best;
}

Fig. 8. The min_element generic algorithm.

The increment operator (++) advances an iterator to the next element. Equality comparison is used to decide when the end of a sequence is reached. Finally, iterators can be copied and assigned. Requirements for the equality operator (==) and the inequality operator (!=), as well as for copying and assignment, are not stated directly in the body of ForwardIterator, but are obtained through refinement of another concept, Regular. A concept D is said to refine another concept B when all of B's requirements are included in D's requirements. The syntax for expressing refinement relationships resembles that used for expressing class inheritance relationships. The Regular concept (not shown, see [42, §20.1.7]) collects several common requirements supported by most types, including equality comparison, and the abilities to copy and assign objects. The associated type value_type denotes the type of values that the iterator refers to. A requires clause in the body of a concept can place additional constraints on the parameters or associated types of a concept. Here, value_type must model CopyConstructible, a self-explanatory concept from the proposal submitted by Gregor et al. [42] for the upcoming revision of C++. Examples of models of ForwardIterator include all pointer types and the iterator types of the standard containers.

The min_element algorithm works for any sequence of values delimited by a pair of iterators, as long as the iterator type is a model of the ForwardIterator concept and the iterator's associated value_type is a model of the LessThanComparable concept. In the next few paragraphs, we illustrate how type checking works for the call to min_element shown in Figure 10.

The C++ standard library specifies that vector's begin and end member functions shall return types that satisfy the ForwardIterator concept. (In fact, these types satisfy stronger constraints, but this is not important for our purposes.) This satisfies the first requirement on types supplied to min_element. The second requirement, that the iterator's associated value_type is a model of the concept LessThanComparable, is satisfied via the concept map for LessThanComparable<name> given earlier. As both these requirements are, collectively, met by the vector's iterator type and the name type, the call to min_element in Figure 10 passes type checking.

To illustrate the role of concept maps in type checking, and as adapters, consider the call to the less-than operator, "*first < *best", in the body of min_element. The type checker resolves this call to the LessThanComparable concept's less-than operator, looked up in the concept map LessThanComparable<name>.

1  concept ForwardIterator<typename Iter> : Regular<Iter> {
2    typename value_type;
3    requires CopyConstructible<value_type>;
4    value_type& operator*(const Iter&);
5    Iter& operator++(Iter&);
6    Iter operator++(Iter&, int);
7  }

Fig. 9. The ForwardIterator concept (simplified from the one in the STL).

vector<name> names;
// fill names with values
name first_in_line = *min_element(names.begin(), names.end());

Fig. 10. A call to the generic min_element function.

In the example in Figure 10, at template instantiation time the placeholder associated type Iter::value_type is bound to the name type; thus, the call to the less-than operator resolves to LessThanComparable<name>::operator<(*first, *best), which is implemented in terms of namecmp in the LessThanComparable<name> concept map.

Indirections through concept maps are efficient. Recall that in C++ 2003's template compilation model, distinct code is generated whenever a template is instantiated with different types—this compilation model is used in ConceptC++ as well, after type checking a template instantiation. Once the type checker has accepted that the types bound to the template arguments satisfy the template's constraints, code specialized for the particular template instance, in this case the function min_element<vector<name>::iterator>, is generated. Consequently, the calls that depend on template parameters, such as the less-than operator call above, are statically resolved and subject to inlining and other compiler optimizations. This applies, in particular, to calls directed through functions in concept maps. Gregor and Siek give a more detailed account of type checking and compiling ConceptC++'s constrained templates [43].

A concept definition can be preceded with the keyword auto, signifying that no explicit concept map is necessary to establish an is-a-model-of relation between a type and a concept—it suffices that all functions and operations required by the concept are defined for the type. Concept maps can, however, also be written explicitly for auto concepts. Simple concepts, with only a few requirements, are typically defined as auto; we could define LessThanComparable as:

auto concept LessThanComparable<typename T> { ... }
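With this auto variant, a call such as the following would type check without any concept map being written for int (assuming, as stated earlier, that pointer types model ForwardIterator); the built-in operator< satisfies the concept's requirement directly:

int values[] = { 3, 1, 2 };
// No concept_map LessThanComparable<int> is needed: the concept is auto.
int* smallest = min_element(values, values + 3);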

Throughout this chapter, we use ConceptC++’s syntactic shortcuts for succinct expression of constraints: instead of the keyword typename, a concept name can precede a template parameter in a template parameter list, or an associated type in the body of a concept. To demonstrate the former use of the shortcut, the signature of min_element in Figure 8 can be re-written as:

template <ForwardIterator Iter>
requires LessThanComparable<Iter::value_type>
Iter min_element(Iter first, Iter last);

As an example of the latter use, lines 2 and 3 in the body of the ForwardIterator concept in Figure 9 can be replaced with the single declaration "CopyConstructible value_type;".

C. Cross-domain composition

When a concept map adapts a particular type to model a concept, the concept map implements the operations required by the concept in terms of the functionality provided by the type. In this section we move beyond the adaptation of individual types to the adaptation of entire library interfaces. An important aspect of adapting two or more library interfaces is that instead of one type modeling a concept, we define a modeling relationship between concepts defined for two entire library interfaces. This means that a collection of types from one library will model one or more concepts from another library.

We begin with a recipe showing such a modeling relationship between library interfaces in Figure 11. On the left are the data-structures A1, A2, ..., which all model (shown by single arrows) the concept A, where the concept A defines the interface of a library. A (templated, parametric) concept map is defined so that the concept A models the concept B, shown by a double arrow. The concept B defines the interface of another library; a collection of functions in the second library, f1, f2, ..., take as inputs components which must satisfy the concept B. We summarize the process:

first, data-structures A1, A2, etc. are defined to model concepts representing data-structures from the interface of a library A; then, a concept map (or many) is defined from the concepts of library A to the concepts of library B; and as a result, algorithms in library B can use data-structures from library A.
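Schematically, the middle step is a single templated concept map of the following shape (A and B stand for the two libraries' concepts; the body is elided):

// Every model X of concept A becomes, through this one definition,
// a model of concept B as well.
template <typename X>
requires A<X>
concept_map B<X> {
  // implement B's required operations and associated types
  // in terms of the operations that A guarantees for X
};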

Fig. 11. Templated concept map definition describing a modeling relationship between two concepts. Data-structures A1, A2, etc. on the left of the figure are models of the concept A; the modeling relationship is represented with an arrow "→". The concept A is adapted to model the concept B using a concept map that implements B's required operations in terms of operations provided by A, shown with a double arrow "⇒". Any function f1, f2, etc. which takes as inputs components that must satisfy concept B can be passed data-structures that model the concept A.

We now explore a mapping between concepts that adapts abstractions from one domain to those of another domain. This example of cross-domain composition is from the domains of image processing and graph algorithms. Many image algorithms can be viewed as graph algorithms given a suitable representation of images as graphs [87, 24, 88]. In this section we present a partial composition of the Boost Graph Library (BGL) [89] and the Generic Image Library (GIL) [10] that enables many image processing algorithms to be implemented as simple wrapper functions over BGL algorithms. We show the mapping from image-related concepts defined in the GIL to the graph concepts of the BGL. Concept maps are instrumental in such cross-domain compositions. The adaptation involves relatively few lines of code, is transparent to the client, and comes with minimal performance cost.

Note that as part of the adaptation we define a handful of classes that are used as the associated types for the graph concepts. For example, one of the graph concepts requires an associated type out_edge_iterator, used by the graph algorithms to traverse the out-edges of a node in a graph. The image concepts do not require or guarantee the existence of any types that can directly satisfy the requirements for this associated type, so we create a new small class for this purpose. Objects of this class maintain the state of iteration over a pixel's immediate neighbors, as sketched below. The newly created class does not intrude on the image library, and clients of the image library do not need to explicitly define objects of the iterator class, or even be knowledgeable of its existence. This arrangement serves as an example of the division of labor between concepts, concept maps, and traditional adaptation via the creation of new classes, in the case when stateful adaptation is needed.
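A minimal sketch of such an iteration-state class follows; all names here are hypothetical, and a 4-neighborhood is assumed. It enumerates the edges from a pixel location to its in-bounds immediate neighbors.

#include <utility>

struct pixel_coord { int x, y; };

class neighbor_edge_iterator {
  pixel_coord src_;     // the vertex whose out-edges we enumerate
  int width_, height_;  // image bounds, needed to skip invalid neighbors
  int k_;               // which neighbor (0..3) we are positioned at

  static pixel_coord offset(int k) {
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    pixel_coord p = { dx[k], dy[k] };
    return p;
  }
  bool valid(int k) const {
    int nx = src_.x + offset(k).x, ny = src_.y + offset(k).y;
    return 0 <= nx && nx < width_ && 0 <= ny && ny < height_;
  }
  void skip_invalid() { while (k_ < 4 && !valid(k_)) ++k_; }

public:
  // k == 0 gives the begin iterator, k == 4 the end iterator.
  neighbor_edge_iterator(pixel_coord s, int w, int h, int k)
    : src_(s), width_(w), height_(h), k_(k) { skip_invalid(); }

  // Dereferencing yields an edge as a (source, target) pair of locations.
  std::pair<pixel_coord, pixel_coord> operator*() const {
    pixel_coord t = { src_.x + offset(k_).x, src_.y + offset(k_).y };
    return std::make_pair(src_, t);
  }
  neighbor_edge_iterator& operator++() { ++k_; skip_invalid(); return *this; }

  friend bool operator==(neighbor_edge_iterator const& a,
                         neighbor_edge_iterator const& b)
  { return a.k_ == b.k_; }
  friend bool operator!=(neighbor_edge_iterator const& a,
                         neighbor_edge_iterator const& b)
  { return !(a == b); }
};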

1. Background of GIL and BGL

The Generic Image Library is Adobe's open-source image processing library, and also part of the C++ Boost collection of peer-reviewed C++ libraries (www.boost.org). The GIL defines concepts for raster images of any dimension, and provides generic implementations of basic image algorithms, such as copying, comparing, and applying a convolution. The GIL's algorithms operate on an open-ended set of image types that may vary in color-space, pixel type, storage order, and other image characteristics.

The Boost Graph Library [92] is a widely used library of generic algorithms for manipulating graphs. The BGL defines concepts that describe different capabilities of graph data structures, such as incidence graphs that provide access to the outgoing edges of each vertex, vertex list graphs that additionally allow access to all vertices in the graph, and edge list graphs that add the ability to access all edges in the graph. The BGL also provides useful data structures modeling these concepts, many implemented in terms of STL containers (essentially as compositions of vectors, lists, and maps).

Neither the BGL nor the GIL is yet implemented using ConceptC++. We reimplemented a subset of these libraries in ConceptC++ for our experiments. We omit support for mechanisms like the BGL's "named parameters" [89, §2]. The BGL describes its algorithms' requirements using STL-like concept documentation, and the GIL uses pseudo-code mimicking ConceptC++ to document its concepts; our concepts are translations of these documents into ConceptC++.

For the adaptation layer between the GIL and the BGL we defined concept maps for several GIL concepts, making those concepts models of various graph concepts in the BGL. Using the adaptation layer, many image processing algorithms can be implemented as thin wrappers over the BGL's graph algorithms. We describe the implementation of the adaptation layer, along with the implementations of multiple algorithms. In particular, we focus on the flood-fill algorithm. This algorithm modifies the color of a set of contiguous pixels that satisfy a predicate. The implementation of this algorithm performs a recursive search through neighboring pixels of an initial seed pixel. Applications of the flood-fill algorithm include transformation of a block of one color to another, insertion of a background texture (green screening), and image partitioning.

We also report on using the adaptation layer for image segmentation. Graph-based image segmentation refers to a set of techniques for finding a partition of an image by representing the image as a graph, then finding a partition of the graph using, e.g., edge weights or minimal cuts as the partitioning criteria [29, 88]. We chose to implement a basic segmentation algorithm based on pixel similarity. The third algorithm we describe finds minimal-energy paths between two points in an image. This can be accomplished with the Bellman-Ford [7] shortest-path graph algorithm when the input image is represented as a graph.

2. GIL–BGL composition

In our adaptation of images to graphs, vertices correspond to pixel locations and each edge connects two vertices corresponding to neighboring pixel locations. Note that we are differentiating between a pixel's location and a pixel's chromatic value. The flood-fill algorithm is essentially a breadth-first graph search. The BGL's breadth-first search algorithm imposes several concept requirements on the types of its inputs, which now have to be satisfied by the image type. We could establish the image-to-graph correspondence directly with concept maps that adapt concrete image types to the BGL graph concepts. However, a broader adaptation, for an open-ended class of image types, is achieved if we generically adapt all types that model GIL image concepts to model BGL concepts. Furthermore, the adaptation is not specific to breadth-first search and flood-fill; many algorithms in the BGL use the same handful of concepts in their constraints.

The breadth_first_search function in our graph library is shown in Figure 12. For brevity, in all code examples we omit header includes, namespace prefixes of names from both the GIL and the BGL, and the prefix std:: for names defined in the standard library. The breadth_first_search function is parametrized on the graph type, the type of queue used for storing references to vertices to maintain search state, a visitor type used for providing callback functions for various event points of the algorithm, and the type of color map used for tracking which vertices have already been visited. The breadth_first_search function uses four concepts to constrain its template parameters. The IncidenceGraph concept specifies the requirements for the graph type: operations for enumerating the out-edges of a given vertex, along with their incident vertices. The other concepts are Buffer, which describes the operations of the vertex queue; BFSVisitor, which specifies the dictionary of the callback functions; and ColorMap, which defines the interface to the data structure storing vertex visitation information.

template <typename G, typename Queue, typename Visitor, typename CMap>
requires IncidenceGraph<G> && Buffer<Queue>
      && BFSVisitor<Visitor, G> && ColorMap<CMap>
      && SameType3<Queue::value_type, CMap::key_type, G::vertex_t>
void breadth_first_search (const G& g, const G::vertex_t& s,
                           Queue& Q, Visitor V, CMap Color);

Fig. 12. The signature of the breadth_first_search function in the BGL. The same-type constraint guarantees that the types of the function arguments are consistent, that is, that the type of the values in the queue argument and the key type of the color map are the same as the type of the vertices in the graph.

We focus on the IncidenceGraph concept, shown in Figure 13, in the description of the adaptation layer between images and graphs. The Graph concept, also in Figure 13, specifies associated vertex and edge types which, via refinement, become requirements of IncidenceGraph. Directly, IncidenceGraph requires the out_edges, out_degree, source, and target operations (lines 7–10). The out_edges function returns a pair of iterators that specify the sequence of edges emanating from a given vertex, and out_degree is for querying how many such edges there are. The associated type out_edge_iterator on line 4 has its expected meaning.

From the point of view of a type modeling a concept, operations specified in a concept are requirements that must be satisfied. From the point of view of an algorithm constrained by a concept, the operations are capabilities that can be relied upon.

   concept Graph<typename G> {
     Regular vertex_t; Regular edge_t; };
 3 concept IncidenceGraph<typename G> : Graph<G> {
     ForwardIterator out_edge_iterator;
     requires SameType<out_edge_iterator::value_type, edge_t>;
 6   pair<out_edge_iterator, out_edge_iterator>
       out_edges (const vertex_t&, const G&);
     size_t out_degree (const vertex_t&, const G&);
 9   vertex_t source (const edge_t&, const G&);
     vertex_t target (const edge_t&, const G&); };
   concept Point<typename P> : Regular<P> {
12   IntegralLike value_type;
     const value_type& operator[] (const P&, size_t);
     value_type& operator[] (P&, size_t);
15   size_t num_dimensions(P); }
   concept ImageView<typename V> : Regular<V> {
     Locator locator; Point difference_type; Regular value_type;
18   requires SameType<locator::difference_type, difference_type>
       && SameType<locator::value_type, value_type>;
     const value_type& operator[] (const V&, const difference_type&);
21   value_type& operator[] (V&, const difference_type&);
     difference_type dimensions(const V&);
     size_t size(const V&); };

Fig. 13. Concepts and concept-maps for graphs and images. Lines 3–10 show the IncidenceGraph concept. The Regular concept, defined in the draft standard library of ConceptC++, describes a type that is "well-behaved", that is, it can be constructed, destructed, copied, assigned, and compared for equality. Furthermore, these operations adhere to fundamental laws, such as: after assigning a value x to a variable y, y == x returns true. The ForwardIterator concept (outlined in Figure 9) is also a standard concept. Lines 16–23 show the capabilities provided by the GIL concepts that are relevant for the adaptation. The IntegralLike concept (not shown, see [42, §20.1.7]) is a concept from the draft standard library. The Locator concept provides traversal capabilities through the values, i.e., pixels, of GIL images—we omit this concept since locators offer no generic way to determine the validity of a locator's position. Our adaptation maintains the current location in algorithms with values of the ImageView's difference_type.

In our example, the GIL concepts' capabilities are used to satisfy the BGL concepts' requirements. The concepts in Figure 13 on lines 11–23 describe the interface that the GIL imposes on images, and thus provides as images' capabilities. The ImageView concept on line 16 provides the capabilities of a container. The associated type value_type (3rd declaration, line 17) is the type of the pixels. The locator (1st declaration, line 17) represents the position of a pixel in an image; we will refer to values of type locator as pixel locators to help distinguish their use. The difference_type (2nd declaration, line 17) represents the offsets between pixel locators, and can also be used to determine the position of a pixel in an image. The function dimensions (line 22) returns the extents of an image. The GIL ImageView concept has further requirements, such as an iterator for linear traversal over the sequence of pixels. These capabilities are not necessary for the adaptation to graphs, and are not shown.
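As a small illustration (ours, not part of either library), a generic function can rely on these capabilities directly; the following hypothetical helper reads a pixel through the indexing operation that ImageView requires:

  template <typename V>
  requires ImageView<V>
  V::value_type read_pixel (const V& img, const V::difference_type& pos) {
    return img[pos];   // operator[] required by ImageView (line 20 of Figure 13)
  }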

[Fig. 14 schematic: (a) a raster image containing pixels p and q; (b) the corresponding graph containing vertices u and v, with directed edges to neighboring positions.]

Fig. 14. The modeling relationship between images and graphs. A schematic showing the intuition of the image-to-graph modeling relationship: pixel locators for pixels ‘p’ and ‘q’ in the image (left) are vertices u and v in the graph (right), respectively. Given a pixel locator in the image, we compute the neighborhood of nearby pixel locators, represented as directed edges in the graph.

A schematic for the concept map from an incidence graph to a raster image is shown in Figure 14. Both pixel locators and the difference type of the image can be used to represent the position of a pixel in an image; a pixel locator can be thought of as the location of a pixel together with a reference to the pixel's color value. Because the pixel locator is more expensive to copy, we choose to use the difference type to represent the position of a pixel. The position of a pixel in an image models a vertex in the graph (the graph is on the right); for instance, the position of the pixel p is the vertex u. The value of the vertex that represents the pixel, i.e., its color, would thus be accessed through a property map. Given the position of a pixel such as p it is possible to compute the neighborhood of nearby pixel locations. This means that an edge (u, v) exists in the graph if the position of the pixel p, represented by vertex u, is a neighbor of the position of the pixel q, represented by vertex v. A pair of pixel positions represents an edge; given an edge, all of the neighbors of the source of the edge can be computed as needed.
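The neighborhood computation just described can be sketched in plain C++ as follows (pos and cardinal_out_edges are hypothetical names; the actual adaptation computes neighbors lazily inside its edge iterators rather than materializing a container):

  #include <utility>
  #include <vector>

  struct pos { long x, y; };   // a pixel position, i.e., a vertex

  // Edges (v, q) to the in-bounds cardinal neighbors q of position v,
  // given the image extents dims.
  std::vector<std::pair<pos, pos> >
  cardinal_out_edges(pos v, pos dims) {
    const long dx[] = { -1, 1, 0, 0 };
    const long dy[] = { 0, 0, -1, 1 };
    std::vector<std::pair<pos, pos> > edges;
    for (int i = 0; i < 4; ++i) {
      pos q = { v.x + dx[i], v.y + dy[i] };
      if (0 <= q.x && q.x < dims.x && 0 <= q.y && q.y < dims.y)
        edges.push_back(std::make_pair(v, q));   // the edge (u, v) of Fig. 14
    }
    return edges;
  }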

The concept_map that adapts ImageView to IncidenceGraph is shown in Figure 15. Lines 3–5 provide definitions for the associated types of IncidenceGraph. We represent vertices as the image's difference_type, a point type that specifies the coordinates of a pixel. Edges are pairs consisting of two vertices: the source and the target. The out_edges function on lines 6–8 constructs a pair of edge iterators that denotes the sequence of out-edges. The number of neighboring pixels, i.e., the number of out-edges, for a given pixel is obtained as the distance between the beginning and end of the sequence of out-edges, which directly gives the implementation of the out_degree function on lines 17–20. The source and target functions on lines 12 and 15 are trivial. To arrive at a flood-fill algorithm, we make one additional adaptation: we use a color map tailored for flood-fill, instead of the BGL's default color map defined for breadth_first_search. The color map stores the search state: unseen, in-progress, and processed vertices are marked with white, gray, and black, respectively.

   template <ImageView Img>
   concept_map IncidenceGraph<Img> {
 3   typedef Img::difference_type vertex_t;
     typedef pair<vertex_t, vertex_t> edge_t;
     typedef out_edge_iterator_adapter<Img> out_edge_iterator;
 6   inline pair<out_edge_iterator, out_edge_iterator>
       out_edges (const vertex_t& v, const Img& g) {
       out_edge_iterator first(v, dimensions(g)), last(num_dimensions(g));
 9     return make_pair(first, last);
     }
     vertex_t source (const edge_t& e, const Img&) {
12     return e.first;
     }
     vertex_t target (const edge_t& e, const Img&) {
15     return e.second;
     }
     size_t out_degree (const vertex_t& v, const Img& g) {
18     pair<out_edge_iterator, out_edge_iterator> iter = out_edges(v, g);
       return distance(iter.first, iter.second);
     }
21 }

Fig. 15. The concept_map adapting models of GIL ImageView to become models of BGL IncidenceGraph.
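The helper class out_edge_iterator_adapter used in Figure 15 is not reproduced in this chapter; the following is a plausible sketch of its shape, assuming it enumerates the in-bounds cardinal neighbors of its source vertex:

  // Hypothetical sketch: a forward iterator over the out-edges of a vertex.
  // It stores the source position, the image extents, and a neighbor index;
  // dereferencing yields the edge (source, current in-bounds neighbor).
  template <typename Img>
  class out_edge_iterator_adapter {
    typedef typename Img::difference_type vertex_t;
    vertex_t v_;        // source vertex (pixel position)
    vertex_t dims_;     // image extents, for bounds checking
    int k_;             // index of the current neighbor
  public:
    out_edge_iterator_adapter(const vertex_t& v, const vertex_t& dims)
      : v_(v), dims_(dims), k_(0) { /* skip out-of-bounds neighbors */ }
    explicit out_edge_iterator_adapter(int end_k) : k_(end_k) {}
    std::pair<vertex_t, vertex_t> operator*() const;  // (v_, k_-th neighbor)
    out_edge_iterator_adapter& operator++();          // next in-bounds neighbor
    bool operator==(const out_edge_iterator_adapter& o) const
      { return k_ == o.k_; }
  };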

Only white vertices are added to the work queue. The default queue and visitor parameters of breadth_first_search that the BGL provides need no customization. With these adaptations, the generic implementation of flood-fill in terms of breadth_first_search is shown in Figure 16. The queue and visitor parameters are those used in the BGL by default, but we need to specify them explicitly since our graph library does not implement the BGL's named-parameters mechanism that provides support for default values for parameters.
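The tailored color map itself is not shown in our listings; a minimal sketch consistent with the description above (the member names are hypothetical) could read:

  enum bfs_color { white, gray, black };

  // Pixels that still satisfy the predicate read as white (unseen) and are
  // therefore enqueued; marking a vertex black performs the replacement.
  template <typename Img, typename Pred>
  class color_map {
    Img* img_;
    Pred p_;
    typename Img::value_type replacement_;
  public:
    color_map(Img& img, Pred p, const typename Img::value_type& r)
      : img_(&img), p_(p), replacement_(r) {}
    bfs_color get(const typename Img::difference_type& v) const {
      return p_((*img_)[v]) ? white : black;
    }
    void put(const typename Img::difference_type& v, bfs_color c) {
      if (c == black) (*img_)[v] = replacement_;   // replace when processed
    }
  };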

template <typename Img, typename P>
requires ImageView<Img> && Predicate<P, Img::value_type>
void flood_fill (Img& img, const Img::difference_type& seed, P p,
                 const Img::value_type& replacement) {
  if (!p(img[seed])) return;
  vector<Img::difference_type> buffer;
  breadth_first_search(img, seed, buffer, basic_bfs_visitor(),
                       color_map<Img, P>(img, p, replacement));
}

Fig. 16. The implementation of flood-fill using an adaptation from GIL to BGL.

The image segmentation algorithm mentioned in Section C can also be implemented in terms of a breadth-first graph search. The task of an image segmentation algorithm is to partition an image into contiguous regions according to some criteria; its central building block is a generic partition algorithm that forms a single partition in an image; repeated invocations with different initial seed locations will segment the entire image. The implementation of the segmentation algorithm uses the same concept map adapters, but a different color map class. The default queue type suffices, but the BGL's visitor type is customized to respond to the event of discovering a vertex by adding that vertex to a data structure representing a partition.

To further evaluate the usability of cross-domain adaptation between images and graphs, we implemented the backbone-healing algorithm [32]. This is a practical image processing algorithm from the TEXTAL automatic model-building protein crystallography package [49], and is used in improving the results of image skeletonization. The algorithm can be implemented in terms of the Bellman-Ford shortest-path graph algorithm, which places stronger requirements on its input graphs than merely IncidenceGraph. The BGL's bellman_ford algorithm requires the capability to iterate over all edges and all vertices, omitting the adjacency structure of the graph, and thus requires that its graph parameter type be a model of the

EdgeAndVertexListGraph concept. To allow GIL images to be used as inputs of the bellman_ford algorithm, a slightly more capable adaptation is necessary, one that supports iteration over all vertices and all edges of a graph. We define two auxiliary classes, analogous to the out_edge_iterator_adapter class discussed above, that masquerade traversal through an image's pixels and its neighboring pixel pairs as iteration over vertices and edges. With these classes, the adaptation between the GIL's ImageView concept and the BGL's EdgeAndVertexListGraph concept is expressed with the concept map:

template <typename Img>
requires ImageView<Img>
concept_map EdgeAndVertexListGraph<Img> { ... }

The implementation of the adaptation is similar to that between the ImageView and IncidenceGraph concepts; we experienced no notable difficulties in realizing it. We make the full code of the adaptations described above available.3
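For illustration, the vertex-iterator half of such an adaptation can be sketched as a row-major walk over pixel positions (a simplified, hypothetical rendering; the real class must also model the standard iterator concepts):

  // Hypothetical sketch: iterate over all pixel positions of an image,
  // presenting each position as a graph vertex. Point is assumed to model
  // the GIL Point concept of Figure 13 (indexable with operator[]).
  template <typename Point>
  class vertex_iterator_adaptor {
    Point cur_, dims_;
  public:
    vertex_iterator_adaptor(const Point& start, const Point& dims)
      : cur_(start), dims_(dims) {}
    Point operator*() const { return cur_; }
    vertex_iterator_adaptor& operator++() {
      // advance in row-major order: x fastest, then y
      if (++cur_[0] == dims_[0]) { cur_[0] = 0; ++cur_[1]; }
      return *this;
    }
    bool operator==(const vertex_iterator_adaptor& o) const
      { return cur_[0] == o.cur_[0] && cur_[1] == o.cur_[1]; }
  };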

3. Performance results of the GIL to BGL adaptation

Adaptation mechanisms can have a negative impact on performance. Mitchell et al. [75] give a detailed analysis of a case where multiple inefficient adaptation layers had a major effect on the performance of a large software system. In our examples we have used adaptation freely, adding layers as appropriate to meet our design goals. In this section we explore the performance costs of adaptation implemented using concept maps. We use the flood-fill, segmentation, and backbone-healing algorithms as test cases, and compare the execution times of two different programs for each algorithm. The first program for each algorithm is written directly in terms of the GIL concepts. Essentially, the flood-fill performs a breadth-first search tailored for images, and the segmentation algorithm a series of such searches. The backbone-healing

3 parasol.cs.tamu.edu/groups/pttlgroup/programming-with-c++-concepts

[Fig. 17, panels (a) and (b): timing curves for "GIL Custom" and "BGL Adaptation".]

Fig. 17. The timing results for the function flood_fill. Results for the Intel architecture are shown in (a) and for the PowerPC in (b). The x-axis is the area of the images in pixels, and the y-axis is the execution time in seconds. Each chart shows the timing results of executing two test programs, the first written directly for GIL concepts ("GIL Custom") and the second written to use the adaptation between the GIL and BGL ("BGL Adaptation").

algorithm searches for all shortest paths using the Bellman-Ford algorithm that was written directly for GIL's image types. The second program for each algorithm uses concept maps to adapt GIL concepts to BGL concepts as described in Section 2, and uses the BGL's breadth_first_search function for the flood-fill and segmentation algorithms, and bellman_ford for backbone-healing. We compiled all of the test programs using the ConceptGCC [38] compiler's Alpha 7 Prerelease version4 with the -O3 flag on two platforms: MacBook Pro (Intel

4 conceptgcc (GCC) 4.3.0 20070330 (experimental) (Indiana University ConceptGCC Alpha 7 Prerelease), checkout 681

[Fig. 18, panels (a) and (b): timing curves for "GIL Custom" and "BGL Adaptation".]

Fig. 18. The timing results for the function segmentation. Results for the Intel architecture are shown in (a) and for the PowerPC in (b). The x-axis is the area of the images in pixels, and the y-axis is the execution time in seconds. Each chart shows the timing results of executing two test programs, the first written directly for GIL concepts ("GIL Custom") and the second written to use the adaptation between the GIL and BGL ("BGL Adaptation").

Core 2 Duo), 2.2 GHz, with 2 GB of RAM, and iMac G5 (PowerPC G5), 2.1 GHz, with 1 GB of RAM. The reported timings were obtained by executing the test programs ten times, and computing the average of the measured running times. The test sets for flood-fill and segmentation algorithms consist of 50 square images each, from the size of 20×20 pixels to 1000×1000 pixels. To make the image size directly determine the size of the problem, we use procedurally generated images where the number of reachable pixels is proportional to the image size. The test images consist of a


maze with vertical, horizontal, and diagonal lines, as well as corners and dead-ends. Figures 17(a) and 17(b) show the results for flood-fill, and Figures 18(a) and 18(b) the results for segmentation. The test set for the backbone-healing algorithm consists of ten square images, from the size of 10 × 10 pixels to 100 × 100 pixels. The test images are a topographical representation of a complex terrain with a non-trivial shortest path between the start and target pixels. The length of the optimal path grows linearly with the side lengths of the image, increasing the size of the problem accordingly; Figures 19(a) and 19(b) show the timing results.

[Fig. 19, panels (a) and (b): timing curves for "GIL Custom" and "BGL Adaptation".]

Fig. 19. The timing results for the function backbone_healing. Results for the Intel architecture are shown in (a) and for the PowerPC in (b). The x-axis is the area of the images in pixels, and the y-axis is the execution time in seconds. Each chart shows the timing results of executing two test programs, the first written directly for GIL concepts ("GIL Custom") and the second written to use the adaptation between the GIL and BGL ("BGL Adaptation").

For each algorithm we tested, the performance of the two implementations is close to the same. The averaged (over different image sizes) abstraction penalties, defined as the ratio of the running time of an abstracted implementation to that of a direct implementation [52, §D.3], due to the adaptation were as follows: for the Intel architecture, flood-fill 0.92, segmentation 0.96, and backbone-healing 1.07; for the PowerPC architecture, flood-fill 1.11, segmentation 1.01, and backbone-healing 1.08. (A penalty below 1.0 means the adapted implementation was faster; for example, 0.92 indicates the adapted flood-fill ran about 8% faster than the direct version.) The implementation using cross-domain adaptation thus, in all cases, achieves performance roughly on a par with a hand-written GIL algorithm. In two cases the implementation via adaptation was faster. This (close to) zero-overhead adaptation is due to C++'s template compilation model, where specialized code is generated for each different template instantiation. As discussed in Section 2, calls to functions defined in concept maps can be statically resolved, and often inlined, allowing the optimizer to see through adaptation layers.

Our tests compared the performance of algorithms written in terms of different data structures: images and graphs. To minimize noise, we were careful to ensure that the compared algorithms nevertheless employed a common strategy, for example for updating work lists. In a few cases, the control structure of the code differs because the direct implementation can take advantage of properties specific to images. Factors such as the use of auxiliary data structures, differences in cache locality, and the success of the compiler's inliner all have an impact on the final observed performance. The experiments suggest that the composition mechanism itself incurs no significant penalties; other factors have a larger impact on performance. Furthermore, a generic algorithm in a widely used software library can be expected to be well tuned and tested; reusing such an algorithm, even via a complex adaptation layer, retains these benefits.

D. Multi-layer composition

In this section we extend the cross-domain composition presented in Section 2 to a multilayer cross-domain composition. This extension is multilayered because we add an adaptation layer from the domain of graphs to the domain of linear algebra, augmenting the adaptation layer from the domain of images to the domain of graphs. We show how, by using the multilayer cross-domain composition of images-to-graphs-to-matrices, we can directly express the architecture of an important class of image analysis algorithms.

The use of concepts for the construction of multilayer adaptations has many benefits over other methods. For instance, consider the construction of a multilayer library composition using object-oriented techniques. Such a multilayer composition would use a hierarchy of interfaces and class adapters and wrappers, commonly integrated into a framework [71]. Such object-oriented systems suffer from problems such as hierarchy-hardening, fragmentation, and inflexible data representations [71]. A particular problem is that the client of the library composition must explicitly wrap their classes, and thus also objects, to make the classes compatible with the various layers. This wrapping must occur because object construction is hard-wired into the data structures and algorithms and cannot be changed by the client. While such wrapping may be possible for some layers, other layers may not be accessible to the client. This means that some library compositions are impossible without exposing the internal representation of the adaptation to the client.

A solution utilizing concepts circumvents many of the problems associated with the traditional implementation of multilayered compositions. The use of concepts means that class and object wrapping are unnecessary due to the presence of retroactive modeling. Data representations are not hardwired to the algorithms or data structures of the library, as concepts are defined for classes of types and not for an individual type.

Finally, multiple layers of adaptation can be effected transparently—the crux of this chapter.

The motivating example for this section comes from the class of spectral clustering algorithms, which are popular, high-quality clustering algorithms [68]. In particular we implement the ncuts algorithm for bipartitioning images as described by Shi and Malik [88]. Spectral clustering algorithms were first developed for clustering and partitioning data represented by graphs, but have since been applied to image analysis. The purpose of such algorithms is to identify a partition of a data-set where each datum in a subset is more related to the other data in its subset than to the data in the rest of the data-set. Algorithms like ncuts create a bipartition of the data and not a general partition of the data into many parts. Should the data need to be partitioned into more than two parts, the ncuts algorithm can be run repeatedly on the individual partitions of the data. It is a property of the ncuts algorithm that secondary cuts will always occur on strict subsets of the remaining edges in the graph representation of the data [88].

The Shi and Malik ncuts algorithm is an especially important example in the class of spectral clustering algorithms [68].5 The ncuts objective gives results that tend to greatly outperform older clustering methods such as k-means clustering or radial-basis classification [68]. Clustering can be used for image segmentation, which is an important step in the analysis of raster image data, for example in computer vision. The algorithm described by Shi and Malik relies on a multilayer composition of images-to-graphs-to-matrices. A number of spectral clustering methods rely on

5 As of May 11th, 2009, the website citeseerx.ist.psu.edu reports 1061 citations of the Shi and Malik paper [88], not including self-references.

the multilayer composition described by Shi and Malik, and thus it is applicable to many image processing and analysis methods [68]. This means that once the multilayer composition has been implemented, the composition can be reused for many different algorithms. The goal of this section is to directly express the multilayer composition described by Shi and Malik as a library using the concept and concept map features of C++0x.

Without transparent, retroactive adaptation, either the client of the ncuts algorithm has to translate their image data-structures into compatible matrix data-structures, or the programmer of the ncuts algorithm has to translate the ncuts algorithm to work with image data-structures—and not matrix data-structures. In the first case the client of the ncuts algorithm must understand the particular details of the composition described by Shi and Malik, rather than just using the ncuts algorithm, in order to correctly transform their image data-structure into the needed matrix data-structure. Alternately, the library writer must become a domain expert in images, graphs, and matrices so that the ncuts algorithm can be translated from the domain of linear algebra to the domain of graph theory to the domain of image processing algorithms, preserving the meaning of the ncuts algorithm. Instead, with concepts, the composition from images-to-graphs-to-matrices is encapsulated in a library, using the architecture and compositions as specified by Shi and Malik. The result is that the ncuts algorithm is implemented using matrix concepts, the client of the ncuts algorithm passes an image data-structure to the ncuts algorithm, and the adaptation occurs transparently, automatically, and correctly.

The multilayer composition as described by Shi and Malik consists of three layers, which we show schematically, on the right, in Figure 20. The top layer is the set of raster images. Images are defined to be graphs, where each pixel in an image is a vertex in the graph. Edges are added between vertices in the graph to represent some

[Fig. 20 diagram: libPNG → Simple 2d Image Library → Generic Image Library (GIL): Image (§1) ⇒ concept map (§3) ⇒ Boost Graph Library (BGL): Graph (§1) ⇒ concept map (§4) ⇒ Matrix Template Library (MTL): Matrix (§1) ← ncuts algorithm (§2), which uses the Iterative Eigensolver Template Library (IETL) (§1) → Lapack and ATLAS.]

Fig. 20. A high-level view of the GIL-BGL-MTL multilayer composition. This composition is used in the implementation of the Shi and Malik ncuts algorithm. The single arrow from libPNG to the GIL represents the wrapping of the image data by a class. The double arrows from the GIL to the BGL and from the BGL to the MTL represent the composition of library interfaces. The single arrow from the MTL to the IETL is the use of the MTL concepts which the inputs of the ncuts algorithm must satisfy; in addition, the ncuts algorithm uses the IETL, which in turn uses the Lapack and ATLAS libraries.

relationship between pixels in the image, i.e., neighboring pixels, similar intensity values, etc. Thus, the composition described by Shi and Malik matches closely our definition of the image–graph composition, as described in Section 2. The second layer describes a graph as a matrix. The matrix is defined as an adjacency matrix [89], where there is an element in the matrix for each pair of vertices in the graph. The value of each element in the matrix is the label of the edge connecting the two vertices associated with that element; for Shi and Malik this label represents the degree of similarity between two vertices.

For this section we implement a subset of the Generic Image Library's (GIL) concepts [10], a subset of the Boost Graph Library's (BGL) concepts [89], and a subset of the Matrix Template Library's (MTL) concepts [90]. We implemented the

ncuts algorithm with the Iterative Eigensolver Template Library (IETL) [104], which requires that the matrices used as inputs to its components be compatible both with the MTL and with the C bindings for the Lapack [4] and ATLAS [116] FORTRAN back-ends. The particular concepts used from the GIL and BGL differ from those used in Section 2: we must use the VertexAndEdgeListGraph concept instead of the IncidenceGraph concept; however, techniques and code written for that adaptation are used in the multilayer composition.

On the left in Figure 20 are the particular libraries we used or implemented for this section. At the top, we use the open-source C library libPNG [86] to read a raster bit-map image from disk. The data, a character buffer, is lightly wrapped by a class called S2DImg::image. This class satisfies the syntactic and semantic requirements of one of the GIL's raster image concepts. We provide concept maps to define a modeling relationship from the GIL's image concept to a graph concept in the BGL, and from the BGL's graph concept to a matrix concept in the MTL. The MTL matrix concept is used in the requires-clause of the ncuts algorithm, as the IETL requires this matrix concept for its algorithms. Finally, the IETL requires FORTRAN-compatible matrices, as it uses the Lapack and ATLAS back-ends.

1. Background of the MTL

In this multilayer composition we use the Matrix Template Library 2 (MTL) [90].

MTL is a high-performance C++ linear algebra library which utilizes domain-specific notation, expression templates, and adaptation to low-level C and FORTRAN libraries. Other examples of libraries in this domain are Blitz++ [108], Boost µBLAS [113], and MTL4 [35]. MTL represents matrices as two-dimensional containers — rows of elements or columns of elements — with two-dimensional indices to access elements in the matrices. The MTL supports a large number of different kinds of matrices: upper, lower, diagonal, banded, sparse, dense, square, etc. The MTL has not been ported to ConceptC++. For this experiment we implement only the subset of the MTL needed for the multilayer cross-domain composition.

We also use the Iterative Eigensolver Template Library (IETL) [104], which consists of a number of algorithms and components useful for finding eigenvectors and eigenvalues of matrices. The IETL algorithms accept MTL-compatible matrices as input. The Shi and Malik algorithm uses the Lanczos method for iterative eigensolving; we note that the IETL only supports the Lanczos method through the use of a combination of the ATLAS [116] and Lapack [4] libraries [104]. Due to Lapack and ATLAS, certain memory layout requirements are imposed on the containers which represent the matrices. Satisfying the memory layout requirement requires copying the sparse matrix data-structure which is the input to the ncuts algorithm into a Lapack- and ATLAS-compatible data-structure. This copying occurs within the ncuts algorithm, so no further work is needed from the adaptation.

2. Implementation of the ncuts algorithm

The core of the ncuts algorithm is a formula, called an objective, which describes a minimization problem over the sum of the weight-labels of the edges representing a graph-cut. This minimization problem uses a metric of similarity measures between the two partitions described by the graph-cut. The metric is normalized, i.e., the measure of similarity of each part of the data-set is divided by a measure of similarity of the whole data-set. This normalization of the metric guarantees that each datum is more similar to the part of the bipartition it belongs to than to the other part. Finding such a normalized cut is an NP-complete problem [88], but it can be approximated in polynomial time with an iterative eigensolver; Shi and Malik use the Lanczos algorithm in their paper.
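For reference, the objective as formulated by Shi and Malik [88] can be written as follows, where A and B are the two parts of the bipartition of the vertex set V and w(u, v) is the edge weight:

    cut(A, B)   = Σ_{u ∈ A, v ∈ B} w(u, v)
    assoc(A, V) = Σ_{u ∈ A, t ∈ V} w(u, t)
    Ncut(A, B)  = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)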

Our implementation of the ncuts algorithm follows the description in the paper. That is, we implement the multilayer composition architecture described by Shi and Malik, and use their measure of similarity — a product of the normalized similarity of intensity between two pixels and the normalized distance used in calculating a neighborhood for each pixel. The neighborhood of pixels for a given pixel p is defined to be every pixel q within some chosen distance of p, where the pixel q is also within the boundary of the image. We try to keep various other parameters as similar as possible. We differ from Shi and Malik's implementation in that we express the multilayer composition directly, using concepts, rather than translating the data from an image data-structure, to a graph data-structure, to a matrix data-structure. Details of the measure of similarity, the derivation of the algebraic representation of the ncuts algorithm, and the proof that the ncuts objective is NP-complete can be found in the article by Shi and Malik [88].
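In Shi and Malik's formulation [88], this measure of similarity between pixels i and j takes the form

    w_ij = exp(−‖F(i) − F(j)‖² / σ_I²) · exp(−‖X(i) − X(j)‖² / σ_X²)   if ‖X(i) − X(j)‖ < r,
    w_ij = 0   otherwise,

where F(i) is the intensity value of pixel i, X(i) its spatial location, σ_I and σ_X are scaling parameters, and r is the neighborhood radius.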

3. GIL–BGL composition for MTL

In this section we describe the modeling relationship used for the GIL-to-BGL adaptation in the multilayered composition, which mimics the Shi and Malik architecture. In Section 2 we saw that for any given pixel locator the neighboring pixel locators were the four cardinal neighbors; the multilayered composition cannot make this same assumption. For the normalized cuts algorithm we must generalize the modeling relationship from images to graphs to one where a pixel locator is related to any pixel that is within some radius set by the user of the algorithm. This means that a pair of types representing an image and a radius will model the graph concept.

For this composition we use a standard tuple<...> type containing a pointer to the image and a value representing the radius. The image type must model the GIL concept ImageView, and the radius's type (this is a dynamic value in this composition) must model the associated difference_type of the ImageView concept. The first value of the tuple is a pointer so that the image is not copied into the tuple, since copying is a potentially expensive operation. A concept map is then defined from the tuple — a pointer to an image type bound by the ImageView concept, and a radius value — to the concept VertexListGraph, as shown in Figure 21. Lines 4–9 define the associated types for the concept VertexListGraph in terms of helper types instantiated with associated types of the concept ImageView. The edge_descriptor is the same as the definition of the edge_descriptor in Section 2.

The out_edge_iterator associated type is defined on line 7; it uses a new helper type instantiated with the vertex_descriptor associated type (the difference_type of the image). This version of the type out_edge_iterator iterates over all of the 'safe' neighbors of the current vertex. By 'safe' neighbors we mean all those neighbors within the window defined by the value radius (the second value of the tuple) that are also within the bounds of the image. The vertex_iterator is defined on line 9 and iterates over all of the pixel locations within the image.

The function out_edges is defined on line 10: it makes a call to another function, also called out_edges, that builds a neighborhood of legal pixel locations for v based on the size of the neighborhood radius value radius_m and the bounds of the image. The functions target and source are similar to the functions defined in Section 2. The function vertices, defined on lines 26–29, builds a pair of iterators which traverse every pixel locator in the image. The function num_vertices is defined as a call to the size of the image.
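Usage of the adapted graph then reduces to constructing the tuple; a sketch, where my_view is a hypothetical model of ImageView:

  my_view img = /* ... obtain an image ... */;
  my_view::difference_type radius(2, 2);    // a 2-pixel window
  std::tuple<my_view*, my_view::difference_type> g(&img, radius);
  // g models VertexListGraph via the concept map of Figure 21:
  std::size_t n = num_vertices(g);          // same as size(img)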

4. BGL–MTL composition

In this section we describe the adaptation from a BGL graph concept to an MTL matrix concept. The basic idea is to represent a graph as an adjacency matrix:

template <typename Img>
requires GIL::ImageView<Img>
concept_map VertexListGraph<
    std::tuple<Img*, typename GIL::ImageView<Img>::difference_type>> {
  typedef typename GIL::ImageView<Img>::difference_type vertex_descriptor;
  typedef std::pair<vertex_descriptor, vertex_descriptor> edge_descriptor;
  typedef ViewToVertexList<Img>::out_edge_iterator_adaptor
    out_edge_iterator;
  typedef ViewToVertexList<Img>::vertex_iterator_adaptor
    vertex_iterator;
  inline std::pair<out_edge_iterator, out_edge_iterator> out_edges
      (const vertex_descriptor& v,
       const std::tuple<Img*, vertex_descriptor>& g) {
    return ViewToVertexList<Img>::out_edges(v, g);
  }
  inline vertex_descriptor source (const edge_descriptor& e,
      const std::tuple<Img*, vertex_descriptor>&) {
    return e.first;
  }
  inline vertex_descriptor target (const edge_descriptor& e,
      const std::tuple<Img*, vertex_descriptor>&) {
    return e.second;
  }
  inline std::size_t out_degree (const vertex_descriptor& v,
      const std::tuple<Img*, vertex_descriptor>& g) {
    std::pair<out_edge_iterator, out_edge_iterator> oedges
      = out_edges(v, g);
    return utl::distance(oedges.first, oedges.second);
  }
  inline std::pair<vertex_iterator, vertex_iterator> vertices
      (const std::tuple<Img*, vertex_descriptor>& g) {
    // create 0-vector, size, and last-position: Zero, dims, end_pos
    return std::make_pair(vertex_iterator(Zero, dims),
                          vertex_iterator(end_pos));
  }
  inline std::size_t num_vertices
      (const std::tuple<Img*, vertex_descriptor>& g) {
    return size(*std::get<0>(g));
  }
}

Fig. 21. Neighborhood concept map. The concept map from the tuple of a pointer to an image type and a radius to the concept VertexListGraph, where the image type is bound by the GIL concept ImageView and the radius is bound by the associated type difference_type of the concept ImageView. The definition of the associated types of the VertexListGraph relies on helper types instantiated with some of the associated types of the concept ImageView. The definitions of the signatures of the VertexListGraph are similar to the concept map shown in Section 2.

an element in the matrix is the label between any two vertices in the graph, and the matrix is indexed by the vertices of the graph. Similar to the GIL-to-BGL adaptation, we define a modeling relationship from a tuple of a graph and two helper types to a matrix concept. The helper types must both model the BGL's PropertyMap concept, which represents a store of information accessed by keys, returning arbitrary values. Default—and automatically supplied—implementations for these property maps can be defined, allowing the user to compose graphs to matrices without any intervention. The decision to use the general modeling relationship is made at the point where the image is used with the ncuts algorithm.

BGL graphs do not directly represent properties (labels) such as edge-weights; instead, labels are supplied by an external entity called a property map. A property map takes a key, for example, an edge-descriptor, and returns a value, e.g., the weight of the edge. Likewise, we define a scalar index for each vertex in the graph with another property map. This property map takes a vertex-descriptor as a key and returns an integer so that if there are N vertices, each vertex is given a unique integral value in the range [0, N). Thus, the value at the (i, j) index of the matrix M modeled by a graph is given by M_ij = w(u, v), where the scalar index of the vertex u is i, the scalar index of the vertex v is j, and w is the weight function for edges in the graph. In sum, this means that an adaptation from graphs to matrices is implemented as a tuple of a graph, an edge-weight property map, and an index property map to an adjacency matrix.
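The indexing convention can be summarized with a small sketch (the map types are hypothetical; get() is the BGL property-map read operation, and the iostream output is only for illustration):

  #include <cstddef>
  #include <iostream>

  template <typename EWMap, typename IdxMap, typename Edge>
  void print_entry (const EWMap& ewm, const IdxMap& idx, const Edge& e) {
    std::size_t i = get(idx, e.first);    // row index of the source vertex
    std::size_t j = get(idx, e.second);   // column index of the target vertex
    // the adjacency-matrix element M(i, j) is the weight stored for the edge
    std::cout << "M(" << i << ", " << j << ") = " << get(ewm, e) << "\n";
  }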

We show the overall architecture of the concept map from the GIL ImageView and its associated property maps for edge-weight and vertex-index to the MTL concept SparseMatrix in Figure 22. On the left is the composition from an image to a graph. This composition is defined through a modeling relationship from a pair of types (shown with dotted arrows) to the BGL concept VertexListGraph (shown with a

[Fig. 22 schematic: tuple<GIL::ImageView*, Radius> ⇒ BGL::VertexListGraph; tuple<VertexListGraph, EdgeWeightMap, VertexIndexMap> ⇒ MTL::SparseMatrix ⇒ IETL/ncuts.]

Fig. 22. A schematic of the implementation of Shi and Malik's image analysis architecture. On the left is GIL's ImageView concept and a run-time value representing the radius of the neighborhood, chosen by the user; a tuple of these two types (shown with dotted arrows) models BGL's VertexListGraph concept (shown with a solid arrow). A tuple of BGL's VertexListGraph with property maps storing the values of edge-weights and the integral indices of the vertices of the graph (shown with dotted arrows) satisfies the requirements of MTL's SparseMatrix concept (shown with a solid arrow). The double arrow represents the use of the SparseMatrix concept as a requirement of the ncuts algorithm; the ncuts algorithm is implemented using functionality provided by the IETL.

solid arrow); the first element of the pair is a pointer to a type that satisfies the requirements of the GIL ImageView concept, and the second element of the pair is the type of a value for the radius — this type must satisfy the requirements of the associated type difference_type of the ImageView concept. The composition from a graph to a matrix is shown on the right side. The composition is defined through a modeling relationship from a tuple of a graph and two property maps (shown with dotted arrows) to the MTL concept SparseMatrix (shown with a solid arrow). This is shown by the boxes in the diagram: BGL::VertexListGraph for the graph, and the boxes EdgeWeightMap and VertexIndexMap for the edge-weight and vertex-index property maps, respectively. The SparseMatrix concept is used within the ncuts algorithm (the requirement is shown by the double-lined arrow), which performs a set of matrix-matrix operations, such as multiplication and subtraction. The result of these matrix-matrix operations is a new, dense, ATLAS- and Lapack-compatible matrix which is passed to the Lanczos iterative eigensolver through the

IETL interface. All matrix concepts in the MTL require the capability to iterate over all of the rows (columns) of a matrix and all the elements within those rows (columns). This requirement on matrices translates to a requirement on graphs; in particular, since a vertex will be used to get an index into a row or column of a matrix, the graph must support efficient—but not necessarily random-access—iteration over the graph's vertices. The BGL graph concept VertexListGraph provides the capability of iteration over all the vertices in the graph [89]. However, a graph that satisfies VertexListGraph does not necessarily contain every possible pair of vertices (a graph with such a capability is called an AdjacencyMatrixGraph). Since the VertexListGraph represents a matrix with mostly zero-valued entries, we define a vertex list graph to be a model of the SparseMatrix concept. Thus, we adapt the BGL's VertexListGraph concept to the MTL's SparseMatrix concept. The adaptation (the concept map) is parametrized over additional types, i.e., the necessary property maps for the modeling relationships from vertex-descriptor to integral scalar, and from out-edge to label-weight.

The concept for the MTL’s SparseMatrix is shown in Figure 23. The definition of the SparseMatrix concept is somewhat awkward because ConceptG++ does not support concept maps which retroactively add member functions for the type being modeled: updates to this compiler and proposed extensions would allow a more el- egant implementation. Briefly, lines 2–4 declare the associated types for iterators for rows, columns, and elements, respectively. Access to the iterator ranges of rows, columns, and elements are through the functions rows, columns, and elements. (Ac- cess to an iterator range for a sequence of elements requires either a row or column iterator, see lines 10 or 12.) The matrix operations that the MTL supports require that the element type of the matrix support the arithmetic operations like addition, 82

auto concept SparseMatrix<typename M> {
  std::ForwardIterator row_iterator = M::row_iterator;
  std::ForwardIterator column_iterator = M::column_iterator;
  std::ForwardIterator elt_iterator = M::elt_iterator;
  std::ArithmeticLike value_type = M::value_type;
  std::pair<row_iterator, row_iterator> rows (const M&);
  std::pair<column_iterator, column_iterator> columns (const M&);
  std::size_t num_rows (const M&);
  std::size_t num_columns (const M&);
  std::pair<elt_iterator, elt_iterator>
    elements (const row_iterator&, const M&);
  std::pair<elt_iterator, elt_iterator>
    elements (const column_iterator&, const M&);
  std::size_t get_index (const row_iterator&, const M&);
  std::size_t get_index (const column_iterator&, const M&);
  std::size_t get_index (const elt_iterator&, const M&);
  value_type get_value (const elt_iterator&, const M&);
}

Fig. 23. The concept for the MTL's SparseMatrix.

The various get_index functions return the index of the given row, column, or element within a row or column. The function get_value returns the value of a particular element. The concept map from the BGL VertexListGraph and the two property maps to the MTL's SparseMatrix is both long and trivial. The tuple needed for the modeling relationship is defined using references (pointers) to the graph and the two property maps. We show an example of the definition of a function required by the MTL SparseMatrix concept in Figure 24. The implementation of the function num_columns is a single line, somewhat dwarfed by the many SameType constraints on the function.
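To illustrate how the concept is consumed, the following generic function (ours, for illustration only) sums the stored elements of any model of SparseMatrix, using only the operations declared in Figure 23 (ConceptC++ syntax, so it compiles only with a concept-enabled compiler such as ConceptGCC):

  template <typename M>
  requires SparseMatrix<M>
  M::value_type sum_elements (const M& m) {
    M::value_type s = 0;
    for (std::pair<M::row_iterator, M::row_iterator> r = rows(m);
         r.first != r.second; ++r.first)
      for (std::pair<M::elt_iterator, M::elt_iterator> e =
             elements(r.first, m);
           e.first != e.second; ++e.first)
        s = s + get_value(e.first, m);   // concept operations from Figure 23
    return s;
  }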

The signature for the function ncuts that implements the ncuts algorithm is shown in Figure 25, along with pseudo-code describing the implementation of the

... // other signature definitions
template <typename G, typename EWMap, typename IdxMap>
requires std::SameType<...> && std::SameType<...>
      && std::SameType<...> && std::SameType<...>
std::size_t num_columns (const std::tuple<G*, EWMap*, IdxMap*>& gm) {
  return num_vertices(*std::get<0>(gm));
}

Fig. 24. Definition of the function num_columns. One of the definitions of the signatures that allow the BGL's VertexListGraph concept to satisfy the MTL's SparseMatrix concept; this figure shows the implementation of the num_columns function. Because the MTL SparseMatrix concept is auto, no explicit concept_map needs to be defined — the only requirement is that the functions exist in the correct scope.

algorithm. We note that the ncuts function is implemented entirely in terms of functions and algorithms defined by the MTL for sparse matrices. The final statement

of the ncuts function is a call to the function ncutsW, which computes the second eigenvector (the graph-cut, see [88]) of the matrix System, which represents the system of equations being solved. The code fragment in Figure 26 shows a PNG file being opened and the ncuts algorithm being used on the image. It is important to note that no translation of the PNG

is performed — the image is passed to the function ncuts, which expects an MTL SparseMatrix. Note in the code in Figure 26 the construction of the tuple on line 3, which binds the image to the window radius value and satisfies the more general graph modeling

relationship required by ncuts. Similarly, the construction of the tuple on line 5 binds the neighborhood proxy to an edge-weight map ewm and a vertex-index map idx for

template <typename Mtx>
requires MTL::Matrix<Mtx>
std::vector<Mtx::value_type> ncuts (Mtx const& mtx) {
  // acquire associated types
  // find the size of the matrix, and build the problem
  const std::size_t N = num_rows(mtx);
  Matrix D(N, N), W(N, N), D_isqrt(N, N), D_m_W, System(N, N);
  // build the weighting matrix W
  // build the matrices D, and the inverse-square-root of D
  // build D - W
  // solve and put solution into System
  return ncutsW(System); // return 2nd eigenvector
}

Fig. 25. The ncuts function. The signature of the ncuts function, and an outline of its body, defined using the MTL. The requirement for this function is that the parameter Mtx satisfy the concept MTL::Matrix. The helper function ncutsW is defined entirely in terms of C-Lapack data-structures so that the C-Lapack back-end of the Lanczos iterative eigensolver can be used.

S2DImg::image img(argv[1]);
// define edge-weight property map ewm and the vertex-index property map idx
auto graph = make_tuple(&img, 2);
std::vector<...> solutions =
  ncuts(make_tuple(&graph, &ewm, &idx));

Fig. 26. Code fragment showing a call to the ncuts function.

the modeling relationship from graphs to matrices. We note that the two uses of the auxiliary types at two positions within the adaptation stack are optional, and that the binding of values occurs only at the entry point; the rest of the composition occurs transparently.

5. Results of the GIL to BGL to MTL adaptation

Figure 27 shows the result of running the ncuts algorithm. The figure shows that the algorithm correctly partitions the data.

[Fig. 27, panels (a), (b), and (c).]

Fig. 27. Segmentation results for ncuts. A 25 × 25 PNG image of a white diamond in a black field is shown in Figure (a). The image uses 8-bit gray-scale, i.e., white is defined as the value 255 and black as the value 0. Gaussian noise was added to the whole image with a mean of µ = 50 and a standard deviation of σ = 25. The ncuts algorithm was run and the 2nd eigenvector's values were split into vertices that are 'positive' and vertices that are 'negative' to create the graph-cut. All positive vertices went into one partition and all negative vertices into the other partition. This is shown graphically in Figure (b). A composite of Figures (a) and (b) is shown in Figure (c), demonstrating that the ncuts algorithm segmented the image into the diamond and the surrounding field.

E. Concept maps and other adaptation mechanisms

This section relates the adaptation capabilities offered by ConceptC++'s concepts and concept maps to those of several other mainstream languages, including C++ 2003.6 Concept maps are a non-intrusive mechanism for adapting operations on a type (or a collection of types) to model a concept. That is, a given collection of operations and associated types might offer the essential functionality required by an interface (concept), but these operations might not have the required names or signatures, and some associated types might not exist with the correct names. Such a collection can be made to conform to the new interface using a concept map. No changes to the original operations or types are needed.

6 This analysis is adapted, with permission, from the paper by Järvi, Marcus, et al. [56].

Concept maps adapt types rather than individual objects, and they do not offer direct facilities to store state. All state is maintained in the objects whose types concept maps adapt. If a collection of types does not provide the essential functionality required to model a concept, new types and operations may need to be introduced before the concept map adaptation facilities come into play. We saw this in Section C, where it was necessary to introduce a new class for traversing sets of pixels in an image, in order to satisfy the requirement for an associated type modeling edge iterators. Once all of the necessary functionality has been implemented with appropriate types and functions, concept maps can be applied in their intended role as non-intrusive, identity-preserving adapters.

The template system of C++ 2003, and that of ConceptC++, is based on instantiating templates with full type information at compile time, allowing all functions defined in concept maps to be statically resolved, possibly inlined, and further optimized. Several concept map adapters can be layered without the adaptation mechanism causing performance degradation. The downside is that all template instantiations to be used in a program must be known at compile time. While this may be acceptable in domains like graph algorithms or linear algebra — indeed, generic C++ 2003 template libraries have found widespread use in these domains — more "dynamic" domains, such as GUIs, necessitate run-time polymorphism. The combination of concepts and run-time polymorphism is explored in the papers by Marcus et al. [70, 55, 56].

In object-oriented languages, libraries publish their interfaces as abstract classes. (Here, this term covers the "interface" language construct found in some languages.) To satisfy the requirements of an interface then means to define a class that inherits from a particular abstract base class. This achieves run-time polymorphism but, in mainstream object-oriented languages, the subclass relation is established at the

time of defining a class, which makes inheritance a relatively rigid mechanism for library composition. A class cannot be made a subclass of another class retroactively without altering its definition. Variations of structural subtyping have been proposed as cures for the problems of rigid class hierarchies [6, 62] but have not found wide use. Outside of mainstream object-oriented languages, Cecil [67] lets subtyping be defined externally to a class definition. This feature was found beneficial for adapting existing types for use with generic libraries in a comparative study of programming languages' suitability for generic programming [31]. Aspect-oriented programming systems can be used to modify classes retroactively, independently of their original definitions, e.g., to implement new interfaces using "static crosscutting" [60]. The adapter pattern [30] is widely used to work around problems of rigid class hierarchies when composing libraries. Adapters can be divided into object adapters and class adapters. Both kinds of adapters inherit from an abstract base class that defines the desired interface. Object adapters store the adaptee as a member (as a reference to a distinct object), whereas class adapters inherit from the adaptee, storing both the adapter and adaptee as a single object. The problems of library composition and adaptation in object-oriented programming are widely studied and recognized. For example, if there is a need to adapt a class with new functionality, but neither the definition of that class nor code that is hard-wired to use that class can be changed, an adapter is not an adequate solution (see, e.g., [71, 103]). Class adapters suffer from hierarchy hardening, and object adapters from inconsistency problems caused by breaking the state of a single entity into multiple objects [50, 20]. We demonstrate techniques that allow us to express large, sophisticated architectures as language constructs in Section C. Essential in our idioms is that we avoid the use of an abstract base class to describe the library interface. Instead, the library interface is specified in terms of concepts. As concept maps are entirely

external to both the types they adapt from and the concepts they adapt to, the problems of object-oriented adapters are avoided. It is possible to use concept maps to adapt client code to and from an abstract base class interface, which is provided for the sole purpose of run-time polymorphism [56]. The constructions to achieve this are somewhat involved, see [70], but can be hidden behind simple abstractions. The benefit is that the choice of whether to use run-time polymorphism is deferred to the time when the components are composed, rather than dictated by the library. Run-time dispatching may incur a performance penalty, which is thus avoided in the cases where run-time polymorphism is not needed. If we ignore, for the moment, concept refinement relationships, some ability to test whether operations are present (via down-casting), and performance optimizations (see [56, §4.3]), the central design feature of the poly<> template presented in the paper by Järvi et al. [56] is that it behaves as a type constructor for defining existential types [63], where the hidden type is constrained to be a model of a particular concept. In this regard, poly<> is similar to Haskell's "forall" construct, which allows the definition of types with a hidden part constrained to belong to a given type class, or classes.

Concepts are in many ways similar to Haskell type classes [112], and concept maps to Haskell's instance declarations. A Haskell type class defines the signatures of the functions that instances (models) of the type class must implement. Instance declarations establish that a type, or a sequence of types in the case of multi-parameter type classes, belongs to a particular type class. Analogous to concept maps, instance declarations are non-intrusive: external to both the definitions of the types and the definition of the type class. Lämmel and Ostermann collect formulations of problems reported in object-oriented integration mechanisms [61], and demonstrate how type classes are effective solutions to many of them. Essential in evading the problems is the non-intrusive adaptation with instance declarations. Our experiences with the

non-intrusiveness of concept maps support this view. In their standard form (in Haskell98 [112, 80]), type classes have a few obvious restrictions, which have largely been remedied in non-standard but common extensions. First, standard type classes accept only one parameter. Multi-parameter type classes, however, are widely supported by Haskell compilers and interpreters. Second, standard type classes do not support associated types. They can, however, be emulated to an extent with functional dependencies [59], a well-established extension, or expressed directly using more recent extensions [15, 14]. There are also less obvious differences between concepts and type classes, some of which affect adaptation and library composition. We explain those differences, but refrain from a comparative evaluation, as we have not produced Haskell implementations of any of the library composition and adaptation scenarios described in this chapter. Garcia et al. compare the suitability of different mainstream languages for generic programming [31].

Haskell can infer the type class constraints of polymorphic functions automatically, while ConceptC++ does not support the analogous "concept inference." To ensure that the constraints of a generic function can be uniquely determined, Haskell requires that an overloaded function name (when called without module qualification) is declared in exactly one type class. When composing independently developed libraries, it is possible that the same function name is accidentally used in two type classes in different modules. Figure 28 translates the classic example of accidental conformance [69] to ConceptC++ and to Haskell. The Haskell version is erroneous and is fixed by qualifying the calls to draw and shoot with the module prefix, as Cowboy.draw and Cowboy.shoot; the ConceptC++ version is valid because ConceptC++ requires a disambiguating annotation, the Cowboy<C> constraint, even if there are no conflicting concepts.

data Canvas = ...

class Rectangle r where
  draw :: r → Canvas → Canvas
  move :: r → Int → Int → r

class Cowboy c where
  draw  :: c → c
  move  :: c → Int → Int → c
  shoot :: c → c

drawShoot = shoot . draw

concept Rectangle<typename R> {
  void draw(R r, Canvas& w);
  void move(R& r, int x, int y);
}
concept Cowboy<typename C> {
  void draw(C& c);
  void move(C& c, int x, int y);
  void shoot(C& c);
}
template <Cowboy C>
void draw_shoot(C& c) { draw(c); shoot(c); }

Fig. 28. Accidental use of the same function name in two different type classes (left column) and in two different concepts (right column).

An instance declaration in Haskell is in effect in all functions in which the declaration is visible. A concept map, however, is only in effect in a context where a type is constrained with the corresponding concept. The example in Figure 29 illustrates this.

The multiplication operator (*) for integers is given different semantics in the two concept maps. The first concept map retains the multiplication operator's original meaning; the second maps the operator to perform addition. Neither mapping has an effect outside generic functions. One or the other of the mappings, neither of them, or both can be in effect within a particular generic function, depending on the function's constraints.

In our slightly contrived example function, both meanings apply. The fact that concept maps define views that are only active when requested is a desirable trait for adaptation and library composition. However, this may prove confusing as well, as it creates a rift between generic and non-generic functions.

As an incremental addition to an evolving C++ language, concept maps and concept-based overloading must co-exist, and sometimes compete, with the existing overloading mechanism of C++ 2003. This results in tension between the needs of traditional ad-hoc function template overloading and the desire to treat concrete types and operations as implementation details, and only expose required functionality through concept maps. In Section F we illustrate some insidious failure cases that can arise from this competition. The Scala programming language [79] provides external adaptation with rather different mechanics, implicit parameters, but with an outcome that is close to adaptation using type classes or concepts. An implicit parameter to a method can be left out in a call to the method. The Scala compiler attempts to find a unique best matching value for that parameter in the call's context. A fairly faithful emulation of type classes is possible with implicit parameters that represent dictionaries of functions [78]. Furthermore, Scala views utilize implicit parameters to non-intrusively define implicit conversions between types, which seems promising for implementing cross-domain compositions like those we discussed in Section C.

C++ 2003 allows the definition of efficient non-intrusive adaptation layers. As an example, we mentioned BGL's transparent adapters for LEDA graphs in Section B. Breuer et al. [11] report on a cross-domain library composition between the domains of linear algebra and graph theory. They adapt several concepts from the Parallel Boost Graph Library [41] to concepts found in the Iterative Eigensolver Template Library [104].

concept Monoid<typename Op, typename C> {
  C operator*(C, C);
  C identity();
}

class additive {};
class multiplicative {};

concept_map Monoid<multiplicative, int> {
  int identity() { return 1; }
}

concept_map Monoid<additive, int> {
  int operator*(int a, int b) { return a + b; }
  int identity() { return 0; }
}

template <InputIterator It, InputIterator It2, typename U>
requires SameType<It::value_type, It2::value_type>,
         Monoid<multiplicative, It::value_type>,
         Monoid<additive, U>,
         Assignable<U>,
         Convertible<It::value_type, U>
U inner_product(It i1, It i1e, It2 i2, U init) {
  for (; i1 != i1e; ++i1, ++i2)
    init = init * ((*i1) * (*i2));
  return init;
}

int main() {
  vector<int> v;
  v.push_back(3);
  v.push_back(5);
  cout << inner_product(v.begin(), v.end(), v.begin(), 100);
}

Fig. 29. Comparison of concept maps and type classes. Concept maps are only in effect in contexts constrained by the corresponding concept. In the inner_product function, the multiplication between *i1 and *i2 comes from Monoid<multiplicative, It::value_type>, and is therefore ordinary integer multiplication, since the concept map Monoid<multiplicative, int> retains the built-in meaning. The multiplication between init and the result of the element-wise multiplication comes from Monoid<additive, U>, and is thus integer addition, as defined by the concept map Monoid<additive, int>. The Assignable, Convertible, and InputIterator concepts come from ConceptGCC's implementation of the proposal by Gregor et al. [42]. The inner_product function computes the inner product of two sequences, accumulating onto an initial seed value init. When executed, the program outputs 134 (that is, 100 + 3×3 + 5×5).

Their implementation is in C++ 2003, and uses overloading and template specialization to achieve the necessary adaptation, not the concept and concept_map constructs of ConceptC++. Non-intrusive adaptation in C++ 2003 relies on a host of tricky template techniques, such as traits classes [77] and conditional overloading using the enable_if template [57]. Though C++ 2003 can support complex non-intrusive adaptation, the resulting code is brittle; ConceptC++ offers improved support for non-intrusive adaptation. Retroactive adaptation mechanisms certainly predate both Haskell's type classes and ConceptC++'s concepts feature. Of particular interest to this dissertation is the category construct of the Spad programming language [58], as implemented in the OpenAxiom [25] variant of the AXIOM family of computer algebra systems. Categories provide many of the features of concepts and type classes; they are closer to concepts, in that OpenAxiom does not support "concept inference," and types are unambiguously annotated with the appropriate category. More powerful mechanisms are currently being added to OpenAxiom to support retroactive adaptation, namely the assume facility for categories.7 The facility effectively binds a category to a domain (the specification of a type) retroactively, separately from the specification of the domain (or category). The assume facility also provides reflection capabilities; e.g., given that the operator * is assumed to match the signature called BinaryOperator, it is possible to request the NeutralElement, which presumably returns the distinguished element constructor 1. This means that categories using the assume feature differ from both type classes and concepts by allowing the user to directly operate on the signatures of categories.

7The information here comes from personal correspondence with Dr. Gabriel Dos Reis and Yue Li, who are developing this facility.

F. Adaptation and overloading

We have demonstrated many benefits that mechanisms like concepts and concept maps provide when composing software components. There are, however, some stumbling blocks. In this section we illustrate some of the difficulties that arise when using these language features.

Non-intrusive adaptation is possible in C++ 2003 as well. For example, the BGL provides a vector_as_graph adapter, which adapts all instances of vector<list<T>> to satisfy the requirements in their IncidenceGraph table. Adaptation is accomplished by overloading the function templates required by IncidenceGraph for its model vector<list<T>>. In ConceptC++, the analogous adaptation is accomplished with a concept map

template <typename T> concept_map IncidenceGraph<vector<list<T>>>;

C++'s ad-hoc polymorphism defines a partial ordering amongst a set of overloaded functions based on the specialization ordering of type patterns. In a nutshell, a function is selected from a set of overloads when it is "at least as specialized as" all other candidates and no other candidate is at least as specialized as it. This relationship considers a type pattern A to be at least as specialized as another type pattern B when A is substitutable for B. ConceptC++ defines an analogous specialization order between concept constraints [54]. For example, one can expect the size function below to match and be applied to all IncidenceGraphs, calculating the total number of edges in a graph, including calls with arguments of type vector<list<int>>, as shown:

template <IncidenceGraph G> int size(const G&);
...
vector<list<int>> g;

int total_num_edges = size(g);

When executed, the above code will use graph operations to calculate and return the total number of edges in the graph. For the purposes of determining the partial ordering of constrained function templates, however, concept constraints are subordinate to type patterns: constraints are only considered when a "tie-breaker" is needed. For example, the type pattern vector<T> is considered strictly more specialized than, say, the IncidenceGraph constraint.

Later in the software's life-cycle, another overloaded size function may be introduced, for example, to return the size of an arbitrary vector:

template <typename T> int size(const vector<T>& a) { return a.size(); }

Now, when we invoke size on a vector<list<int>>, this new overload is considered to be the unique best-matching candidate, since the type pattern in the overload defined for IncidenceGraph is no more specialized than a plain type variable. That is, the size(vector<T>) overload has silently hijacked the function call to size(IncidenceGraph). The code will compile but will erroneously return only the number of lists in the vector representing the graph. Type patterns are the primary overloading criteria in several other mainstream languages. For example, in C# and Java, overloading is based exclusively on type patterns: constraints on type parameters must be satisfied but they do not affect specialization ordering. Overloading rules in various languages support the notion that the most specific knowledge prevails. ConceptC++ offers two mechanisms for specialization, type patterns and constraints, and by subordinating one mechanism to the other we compromise this notion.

A change to the function template partial ordering rules could reinstate the principle that the most specific type knowledge is used for dispatch. In particular, programs with silent failures like the one in the example above could be rejected if, for the purpose of function overload partial ordering, concept map specialization patterns were considered at the same time as function type pattern specializations, rather than as tie-breakers. These issues are currently under consideration in the C++ standards committee. The unrestricted non-intrusive adaptation allowed by current languages leads to conflicting adaptation layers, and even with the above modification to function overloading rules, surprises and ambiguities still seem possible. A situation akin to the one we demonstrated in C++ arises with Haskell's specialization ordering amongst instance declarations, with language extensions that allow the definition of "overlapping instance declarations" [80, §3.7]. Non-intrusive adaptation helps to avoid much of the pre-planning and coordination that is necessary in the use of software libraries when component adaptation is intrusive. For this flexibility, one need not give up efficiency: our work demonstrates that non-intrusive adaptation and efficiency are not mutually exclusive. The possibility of accidentally adapting the same component to a given interface in multiple ways exists, but can be controlled with language designs that support detecting conflicts and offer means for resolving them; further programming language research in this area is required.
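The hijacking scenario is not specific to ConceptC++: standard C++20 concepts likewise subordinate constraints to type patterns during partial ordering, so the failure can be reproduced today. The following self-contained C++20 sketch is our own illustration, not ConceptC++ code; the incidence_graph concept is a deliberately minimal, hypothetical stand-in for BGL's IncidenceGraph requirements.

#include <cstddef>
#include <iostream>
#include <list>
#include <vector>

// Minimal stand-in for the IncidenceGraph requirements (hypothetical;
// not the real BGL concept): a vertex range whose elements are
// out-edge ranges.
template <typename G>
concept incidence_graph = requires(const G& g) {
  g.begin();           // iterate over vertices
  g.begin()->begin();  // iterate over a vertex's out-edges
};

// Library overload: counts all edges of any incidence graph.
template <incidence_graph G>
std::size_t size(const G& g) {
  std::size_t n = 0;
  for (const auto& out_edges : g) n += out_edges.size();
  return n;
}

// Later addition: the size of an arbitrary vector.
template <typename T>
std::size_t size(const std::vector<T>& v) { return v.size(); }

int main() {
  std::vector<std::list<int>> g(3, std::list<int>{1, 2});
  // The vector<T> type pattern is more specialized than the constrained
  // plain type variable G, so this call is hijacked: it prints 3 (the
  // number of lists), not 6 (the total number of edges).
  std::cout << size(g) << '\n';
}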

G. Conclusions

This chapter reports on programming with "concepts", a forthcoming set of potential new features of C++, and explains their use, benefits, and costs. In particular, concepts offer powerful mechanisms for adapting data types to specific library interfaces.

We provide an analysis of this aspect of C++ concepts, and a description of their use in complex cases of non-intrusive library composition. We demonstrate that transparent adaptation of data structures to multiple library interfaces is possible and straightforward. We conclude from our performance evaluations that such adaptations impose minimal penalties. As we show in Section C, such adaptations enable cross-domain composition of libraries. These adaptations allow us to reduce one domain to another domain, with a number of attendant benefits. For example, algorithms and data structures can be implemented in the domain that best (and most naturally) fits their expression and, even if those domains are different, the algorithms and data structures can still be composed transparently to the user.

The main benefits of ConceptC++'s adaptation mechanisms are non-intrusiveness (a type can be adapted to one or more interfaces without altering the definition of the type), flexibility (instead of single data types, a generic family of data types, or classes of data types described using concepts, can be adapted with a single adapter), and performance (adaptation is implemented using small functions whose addresses are statically resolved, and which are thus inlinable and optimizable).

The generic programming paradigm, introduced into C++ via the STL, supports the non-intrusive design of efficient families of algorithms specified in terms of common abstractions. There is little language support, however, for rigorously specifying these abstractions. C++ is now evolving to raise such specifications from the level of naming conventions to compiler-checkable artifacts. C++0x comes with language support for concepts, and the standard library has been re-specified to take advantage of this, while retaining backward compatibility. The backward compatibility requirement, considered essential for the evolutionary acceptance of C++0x, has had an impact on the concepts in the library, but the effort is nevertheless a significant step forward.

The guidance for programming with concepts, as well as the libraries and adaptation idioms presented in this chapter, unburdened by legacy compatibility concerns, are further advances towards leveraging the generic programming approach in constructing reusable software libraries. The utility of these idioms will further increase once the concepts language feature becomes widely available: we expect to see algorithm families designed from the ground up with this support in mind, and to see generic programming emerge as a central tool in the design of modern reusable libraries.

CHAPTER IV

ALGORITHMIC DIFFERENTIATION IN AXIOM

This chapter describes the design and implementation of a framework for algorithmic differentiation in the Axiom (OpenAxiom) computer algebra system [25]. Algorithmic differentiation (AD) is a method for computing the "derivative of a program", that is, augmenting a program to also compute its derivative [45, 94]. This AD framework is implemented as a sequence of transformations on the typed abstract syntax tree of a Spad program. (Spad is the library extension language for Axiom [58].) The framework illustrates an algebraic theory of algorithmic differentiation: if it is possible to define a compositional semantics for programs, we can define the requirements for when those programs can be algorithmically differentiated. By leveraging the generic programming paradigm, our AD framework is not limited to programs that compute with built-in types like floating point numbers or integers, but can operate on any program that computes over, for instance, polynomials, or any type that satisfies certain algebraic requirements (an algorithmically differentiable ring). Our algorithmic differentiation framework is a library-directed transformation tool; it does not rely on any modification of OpenAxiom's Spad compiler. This chapter includes a specification for the core of the Spad language. This specification includes a formal grammar, an operational semantics of relevant parts of that grammar, and a sketch of the denotational semantics for the composition of programs in core Spad with respect to algorithmic differentiation. The grammar specifies the legal constructs that our framework can translate. The operational semantics specifies the meaning of the Spad language using standard tools of the theory of programming languages.

The denotational semantics allows us to discuss the algebraic properties of the composition of statements, and thus properties of programs. We show several examples drawn from the literature on the theory of algorithmic differentiation. These examples show that our framework supports the basic features of algorithmic differentiation found in the literature, in addition to allowing the algorithmic differentiation of programs outside the scope of other tools.

A. Introduction

Algorithmic differentiation (AD) is a technique for computing derivatives of a computer program while avoiding both symbolic differentiation and divided-difference approximations [45]. AD tools usually consist of a set of transformations which augment the code with instructions for computing the value of a derivative. Compared to computing divided differences or symbolic differentiation, AD has several benefits, detailed below. Divided differences suffer from numerical accuracy problems when using fixed-precision numbers. The computation of the difference quotient has a form such as:

\[ f'(x) = \frac{f(x + h) - f(x - h)}{2 \times h} . \]

In this formula, as h becomes smaller the accuracy of the approximation increases; of course, this assumes an unlimited number of bits of accuracy. With fixed-precision arithmetic, the accuracy of the division and subtraction operations decreases the accuracy of the computation when h becomes small enough. Thus, counterintuitively, after a certain point the approximation of the derivative becomes worse as h becomes smaller. Even more frustrating is that the point where the accuracy no longer improves is actually a function of both the value of h and the value of f(x). (Sensitivity and accuracy analysis can be performed using symbolic differentiation or an AD tool to determine an ideal h to alleviate this problem.)
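This breakdown is easy to observe numerically. The short C++ program below, our own illustration, estimates the derivative of sin at x = 1 by central differences for a decreasing sequence of step sizes; the error first shrinks with the O(h^2) truncation term and then grows again once cancellation in the subtraction dominates.

#include <cmath>
#include <cstdio>

int main() {
  const double x = 1.0;
  // The exact derivative of sin at x is cos(x).
  for (double h = 1e-1; h > 1e-14; h /= 10.0) {
    double approx = (std::sin(x + h) - std::sin(x - h)) / (2.0 * h);
    std::printf("h = %.0e   error = %.2e\n", h, std::fabs(approx - std::cos(x)));
  }
  // Typical behavior in double precision: the error bottoms out around
  // h ~ 1e-5 to 1e-6 and then deteriorates as h shrinks further.
}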

AD tools do not have any inherent effect on numerical accuracy; in fact, the numerical accuracy of the derivative is proportional to the numerical accuracy of the unaugmented code [45]. Symbolic differentiation can suffer from the 'expression explosion' problem [45], where the size of the symbolic expression grows very large. Lifting directly from Griewank's "Evaluating Derivatives", we show Speelpenning's example [45, 96]:

\[ f(x) = \prod_{i=1}^{n} x_i := x_1 \times x_2 \times \cdots \times x_n \]

whose gradient has the form:

\[
\nabla f(x) = \left(
\begin{array}{l}
x_2 \times x_3 \times \cdots \times x_i \times x_{i+1} \times \cdots \times x_{n-1} \times x_n , \\
x_1 \times x_3 \times \cdots \times x_i \times x_{i+1} \times \cdots \times x_{n-1} \times x_n , \\
\quad \vdots \\
x_1 \times x_2 \times \cdots \times x_{i-1} \times x_{i+1} \times \cdots \times x_{n-1} \times x_n , \\
\quad \vdots \\
x_1 \times x_2 \times \cdots \times x_i \times x_{i+1} \times \cdots \times x_{n-2} \times x_n , \\
x_1 \times x_2 \times \cdots \times x_i \times x_{i+1} \times \cdots \times x_{n-2} \times x_{n-1}
\end{array}
\right)
\]

This resultant symbolic derivative is neither informative nor particularly memory efficient, nor does it fit nicely onto the printed page (a particularly keen observation by Griewank). AD tools incur only a linear increase in the number of instructions, where the increase is sensitive either to the number of arguments of the function (forward mode) or to the number of returned values of the function (reverse mode) [45].
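To make the linear-cost claim concrete for Speelpenning's example: the entire gradient can be accumulated in O(n) operations, since each partial derivative is the product of all factors except one. The C++ sketch below is our own hand-derived illustration of what a reverse-style accumulation effectively computes; it is not output of any AD tool.

#include <cstddef>
#include <cstdio>
#include <vector>

// Gradient of f(x) = x[0] * x[1] * ... * x[n-1] in O(n) operations:
// df/dx[i] is the product of all entries except x[i], i.e.
// (x[0] ... x[i-1]) * (x[i+1] ... x[n-1]).
std::vector<double> product_gradient(const std::vector<double>& x) {
  const std::size_t n = x.size();
  std::vector<double> sfx(n + 1, 1.0);  // sfx[i] = x[i] * ... * x[n-1]
  for (std::size_t i = n; i-- > 0;) sfx[i] = x[i] * sfx[i + 1];
  std::vector<double> grad(n);
  double pfx = 1.0;                     // pfx = x[0] * ... * x[i-1]
  for (std::size_t i = 0; i < n; ++i) {
    grad[i] = pfx * sfx[i + 1];
    pfx *= x[i];
  }
  return grad;
}

int main() {
  std::vector<double> x = {2.0, 3.0, 5.0};
  for (double g : product_gradient(x)) std::printf("%g ", g);  // prints: 15 10 6
}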

We now demonstrate an example of AD with the Tchebychev polynomial of the first kind:

\[ T_n(x) = \cosh\big(n \operatorname{arccosh}(x)\big) . \]

This polynomial can be defined in the OpenAxiom computer algebra system, discussed in Section B, in (at least) two ways: directly entered into OpenAxiom from the command line as

Tchebychev(x,n) == cosh(n*acosh(x))

or by defining a recursive program, using the recurrence relation:

Tchebychev(x: Integer, n: Integer): Integer ==
  if n=0 then return 1
  if n=1 then return x
  2*x*Tchebychev(x,n-1) - Tchebychev(x,n-2)  -- return value

Clearly, this function is outside the range of conventional symbolic differentiation tools. An AD tool, however, can handle this function without trouble. Several software packages exist for algorithmic differentiation [76, 111, 46, 9, 8], some extending a compiler [76], some implemented as a pre-processor [8], and some purely as a software library [46]. Typically these packages operate on computations over concrete data types, such as the built-in floating point types. In particular, such packages are difficult to use with generic programming techniques, which typically do not know actual data representations in advance but instead state abstract requirements on the types they work with. This chapter discusses an algorithmic differentiation system which is not limited to concrete data types. Instead, the tool can compute the algorithmic derivative of a program as long as the types of the program satisfy suitable requirements, outlined later. We have developed the beginnings of a theory for algorithmic differentiation to specify these requirements. We present the theory in the context of OpenAxiom's library extension language, but suggest that the ideas are more general. Our algorithmic differentiation framework, based on the presented theory, is a library-directed transformation tool for Spad programs.

In this chapter we use tools from the theory of programming language semantics to formalize the meaning of algorithmic differentiation in a generic programming context. This is in the spirit of the work on generic programming by Davenport et al. [22], where language features should be harnessed to expose the elegance of the underlying mathematical theory, not simply to implement a computation. Finally, we observe that our framework is designed as a library, without modification of the OpenAxiom system.

B. The OpenAxiom system

OpenAxiom is a strongly-typed general-purpose computational platform, supporting both numeric and symbolic computations. It uses type information, at compile time, to guide selection and application of operations, a process which supports generic programming. For example, given the recursive definition of the function Tchebychev from the previous section, the Axiom compiler generates direct calls to the multiplication and subtraction operations over Integer values for the fragment

2*x*Tchebychev(x,n-1) - Tchebychev(x,n-2)

as opposed to inspecting, at run-time, the values held by the variables x and n and the values returned by the function Tchebychev, and then resolving the function to call. If the type specified for the variable x were unknown, or if x had a type for which no suitable operation named * could be found, then the expression would be rejected by the type checker and no code would be generated. This design aspect of OpenAxiom contrasts remarkably with most general-purpose computer algebra systems, which rely on dynamic, run-time type-checking. This aspect allows the use of techniques from active libraries and generic programming.

OpenAxiom users can extend the system with libraries implemented in Spad, a general-purpose programming language with affinity towards mathematical software construction and scientific computation. More information about the OpenAxiom system and its use can be found in "the Axiom Book" [58, 21], the seminal papers of J. Davenport and collaborators [23, 22], and on the OpenAxiom web site [25].

C. Spad programming language

This section describes the (abstract) syntax and the semantics of the Spad programming language. For the purposes of this chapter, the intuitive behavior of a Spad program is described by a structural operational semantics, i.e., a semantics described in terms of the syntactic forms of Spad. In this chapter, we relate the behavior of the structural operational semantics to the function a Spad program computes through a standard denotational semantics, and the meaning of the "derived program" to that of the input program. This methodology shapes the final form of the library-directed transformation AD tool.

1. Syntax

This section presents the syntax and semantics of a reasonably large subset of Spad, sufficient to demonstrate the concepts and capabilities of our framework. The language is an extension of the explicitly typed lambda calculus that contains elements of dependent types. Spad also contains mechanisms which make it a suitable language for supporting generic programming and active libraries, notably the category mechanism (similar to C++0x's concepts), and conditional categories and domains, which provide support for algorithm specialization. Figure 30 summarizes the abstract syntax of the Spad subset presented here.

Module A Spad module is a sequence of toplevel definitions.

Module          M ::= ∆+
ToplevelDef     ∆ ::= C | D | P | d
CategoryDef     C ::= φ : Category == E
Exports         E ::= W | X W?
Extension       X ::= τ | Join(τ+)
WithExpr        W ::= with S+
Signature       S ::= x : τ
Type            τ ::= Boolean | Integer | Float | Record(S+) | Union(S+)
                    | τ1 × ··· × τn → τ0 | γ | γ([τ | e]+)
DomainDef       D ::= φ : E == K
PackageDef      P ::= φ : E == K
Capsule         K ::= add R? d+
Representation  R ::= Rep := τ
Definition      d ::= [n | φ] : τ == s
CallForm        φ ::= x(S+)
Statement       s ::= e | if e then s | if e then s else s | i+ repeat s
                    | S == e | S := e
Iterator        i ::= for n in e [suchthat p]? | while p
Expression      e, p ::= c | x^τ | e(e+) | e.x^τ | e case τ | e := e | e where d+
Variable        x^τ
Constant        c^τ
Identifier      x
CDPName         γ    (name of a Spad category, domain, or package)

Fig. 30. Abstract syntax of the Spad language. The notation Z? represents an optional Z, Z+ a non-empty finite sequence of Z; the square brackets are used for grouping.

ToplevelDef A toplevel definition is either a Spad category definition, or a Spad domain definition, or a Spad package definition, or a delayed definition.

CategoryDef A Spad category definition specifies a class of algebras, by declaring the signatures of the required operations. A Spad category definition may extend existing Spad categories with new signatures. For example, the fragment

Monoid(): Category == with
  *: (%, %) -> %
  1: %

declares Monoid as a Spad category with two signatures:

1. the symbol * is a binary operation on the domain belonging to this category;

2. the identifier 1 denotes a constant object of the domain belonging to the category being defined.

The following category definition

Group(): Category == Monoid with
  inverse: % -> %

extends Monoid with the inverse operation, to capture the mathematical notion of group structure. One can think of Spad categories as specifying views on objects.

Exports Definitions for Spad categories, Spad domains, and packages specify exported operations, i.e., the "public interface" in programming languages jargon, through either a WithExpr, or an Extension, or a combination of both.

WithExpr A WithExpr is essentially an unnamed Spad category consisting of a list of operation signatures (Signature).

Extension Definitions for Spad categories and Spad domains may extend existing Spad categories or domains. An extension may specify either a type, or multiple categories through the Join operator. The latter form corresponds to multiple inheritance in object-oriented programming languages.

Signature The specification of a type for an identifier can appear in a WithExpr, as a parameter declaration in CallForm, as the field of a record or union, or in a local variable definition.

Type A type is a built-in type (Boolean, Integer, Float), a record or union, a function type, the name of a Spad category or Spad domain, or an instantiation of a Spad category or Spad domain. All field names specified by signatures in a record must be distinct. Similarly, all field names specified in a union must be distinct and unique in the enclosing scope; this applies recursively to any other union types directly referenced in the signature list of the union.

DomainDef A Spad domain definition provides implementations for views specified by categories. A domain definition has an interface specification part (Exports), stating the categories and possible additional signatures it implements, and an implementation part called a capsule. The implementation part may define the representation of the objects belonging to the domain, and provide definitions for operations declared in its Exports. For example, the program fragment

IntMonoid(): Monoid == add
  Rep == Integer
  (x:%) * (y:%) == (rep x + rep y)$Integer
  1:% == 0::Integer

provides an implementation IntMonoid for the Monoid specification as follows:

• the object representation domain is Integer;

• "multiplication" of two objects in IntMonoid is the value obtained by adding their respective underlying values (the sum being returned by Integer's + operator);

• the Integer constant 0 is the underlying value of the unit of IntMonoid.

Note that a Spad domain almost always references the “current domain” using the symbol %.
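The comparison drawn earlier between Spad categories and C++ concepts can be made explicit for readers more at home in C++. The following C++20 sketch is our own hypothetical analogue of the Monoid category and the IntMonoid domain above; it illustrates the correspondence and is not part of OpenAxiom.

#include <concepts>
#include <cstdio>

// Analogue of the Spad category Monoid: a binary * and an identity.
template <typename T>
concept Monoid = requires(T a, T b) {
  { a * b } -> std::convertible_to<T>;
  { T::identity() } -> std::convertible_to<T>;
};

// Analogue of the Spad domain IntMonoid: represented by an int, with
// "multiplication" implemented as integer addition and identity 0.
struct IntMonoid {
  int rep;
  friend IntMonoid operator*(IntMonoid x, IntMonoid y) { return {x.rep + y.rep}; }
  static IntMonoid identity() { return {0}; }
};
static_assert(Monoid<IntMonoid>);

int main() {
  IntMonoid a{3}, b{4};
  std::printf("%d\n", (a * b).rep);  // prints 7: the monoid operation is +
}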

PackageDef A package definition provides implementations for functions that operate on a Spad domain. Unlike a Spad domain, a package does not define a Representation and does not reference the symbol %. Like a Spad domain, it has an Exports part and an implementation part.

Capsule The implementation part of a Spad domain or package is its capsule. A capsule may specify the representation of a domain (if it is the implementation of a domain), and provides a sequence of toplevel definitions for the operations on the Spad domain's objects, or for the operators in a package.

Representation A Spad domain specifies the underlying representation of its objects by assigning a type expression to the identifier Rep. A Representation can occur only in a Spad domain definition.

Definition A (delayed) definition is the binding of an identifier or a function call expression to a Spad category, Spad domain, or an ordinary function. The body of the definition is evaluated when needed. That evaluation may happen only once for a given argument list. Even though the evaluation is delayed, the body is still fully type checked at the definition point. The Spad language, as understood by the Spad compiler, does not allow ordinary function definitions at toplevel. However, they are the core of the language understood by the interpreter. For uniformity, we include toplevel function definitions in the Spad subset we describe.

CallForm A call form consists of an identifier and a parenthesized sequence of signatures declaring formal parameters. A call form is needed in the definitions of a Spad category, Spad domain, and function.

Statement Statements appear in the body of function definitions. A statement is either an expression, a one- or two-arm if-statement, an iteration where the body of the iteration (a statement) is controlled by a list of iterators, a local variable definition, or an assignment.

Iterator An iterator is either a sequence of items n drawn from a sequence e, possibly filtered by a predicate p, or a repeated evaluation of a predicate.

Expression An expression is either a constant, a variable, a function call, a member selection, a type-case expression, an assignment, or a qualified expression. A qualified expression is an expression that contains free variables and is immediately followed by their definitions in a where-clause. We treat expressions built with built-in operations, such as addition on integers, as function calls.

Variable A variable is the use of a name declared with a given type.

Constant A constant is a built-in value, such as +^(Integer×Integer→Integer), 342^Integer, true^Boolean, etc.

Identifier An identifier is a finite sequence of characters. The set of identifiers in Spad is countably infinite.

2. Language features

The Spad programming language supports elements of dependent types, a result of the functorial nature of the data-structuring mechanisms available in Spad. That is, the Spad type system allows types to be function-like objects with arguments that depend on types and values. Dependent types enable an unusually direct style of implementation of mathematical structures. Aspects of this will be explored later, in Chapter V of this dissertation. The Spad programming language also supports general function overloading; in particular, a function can be overloaded on its argument and return types. The overload resolution algorithm exploits all context information, including arguments and target types, to select the best matching function. Implicit conversion is supported through the coerce operator.

3. Semantics

The computational rules used to evaluate Spad programs are those of eager semantics (call-by-value), and the arguments of functions are passed by reference. We sketch the semantics of Spad programs in two ways: a small-step operational semantics, and a denotational semantics. The small-step operational semantics gives an intuitive idea of the behavior of Spad programs, whereas the denotational semantics lets us associate mathematical functions to Spad programs. The latter allows us to formally talk about the notion of a derivative of a Spad program.

a. Operational semantics

The Spad language is imperative in the sense that its programs operate on stores by explicit modification.

Values of Spad programs can be booleans, integers, floating point numbers, aggregates thereof, or function codes. We denote the collection of values by Value, inductively defined as follows:

Location values: object locations are in Value.
Boolean values: true ∈ Value and false ∈ Value.
Integer values: integer constants n^Integer are in Value.
Float values: float constants f^Float are in Value.

Functions: if f is a defined function of type τ1 → τ2, then the constant f^(τ1→τ2) is in Value.
Aggregates: if c1^τ1, ..., cn^τn are values in Value, then the tuple (c1^τ1, ..., cn^τn) of type τ1 × ··· × τn is in Value; tuples represent record values. Similarly, if c^τ1 is in Value, so is c^(τ2←τ1); it represents a value of a field of type τ1 in a union τ2.

The behavior of a Spad program is a sequence of configurations ⟨p, σ, Γ⟩, where p denotes a fragment of Spad constructs, σ the store of values, and Γ the current environment of bindings of variables to types and expressions. The notation Γ, x^τ == e denotes an environment obtained by extending Γ with the binding x^τ == e; the == e part may be missing. A store σ is a mapping from memory locations to Spad values. We use the notation σ[v/l] to designate the updated function defined by

\[
\sigma[v/l](x) =
\begin{cases}
v & \text{if } x = l \\
\sigma(x) & \text{otherwise.}
\end{cases}
\]

Each configuration is defined by structural induction on the syntax of Spad, as specified in Figure 31.

b. Denotational semantics

The basic idea of algorithmic differentiation rests on the notion that a computer program computes a function whose range has a ring structure, and that the collection of such functions can be endowed with a differential algebra structure. The theory of denotational semantics [98] is a useful tool in laying down the necessary theoretical framework for a meaningful discussion of computing the derivative of computer programs.

[Figure 31 presents the small-step evaluation rules for Spad statements as inference rules over configurations ⟨p, σ, Γ⟩. The rules cover: Variable, Call-Arguments, Call-Operator, Call, Qualified expression, Sequence-Head, Sequence-Tail, If, If-true, If-false, Assignment-Left, Assignment-Right, Assignment (store update), and Immediate definition.]

Fig. 31. Evaluation rules of Spad statements.

We seek a standard denotational semantics ⟦·⟧ of the Spad programming language that respects the operational semantics outlined in §a, i.e.,

\[ t \rightarrow^{*} v^{\tau} \;\Longrightarrow\; \llbracket t \rrbracket = \llbracket v^{\tau} \rrbracket . \]

D. Elements of algebraic theory of algorithmic differentiation

The meaning of transforming a program P to another program P′ so that the function computed by P′ is the derivative of the function computed by P requires a semantics function that computes the meaning of a program. Without such a function, there is no way to relate a sequence of commands given in Spad to our usual intuition of a function with a well-defined derivative. Given such a semantics function, we can describe a sequence of transformations which act the same as the corresponding (mechanical) analytic derivative operations, i.e., the chain rule. In our framework, we define a program written in the Spad language as a SPAD-algebra, where SPAD is the functor structuring the Spad language abstract syntax (Figure 30). Let us call S the collection of all SPAD-algebras, which can be thought of as all the possible abstract syntax trees of programs of the Spad language. We are interested in meaning functions ⟦·⟧ : S → D where the semantic domain D is suitable for talking about derivatives.

1. Algorithmic differential rings

An algorithmic differentiation of a Spad program is a transformation Φ : S → S such that

• the function ⟦Φ(−)⟧ : S → D respects composition, i.e.,

\[ \llbracket \Phi(P_1; P_2) \rrbracket = \llbracket \Phi P_2 \rrbracket \circ \llbracket \Phi P_1 \rrbracket \]

where we have used ; to indicate sequencing, i.e., the action of executing program P1 first, followed by the execution of program P2. This key functorial property embodies the usual chain rule from calculus.

• we let (U, δ) be a differential ring, i.e., a ring with carrier set U equipped with a derivation operator δ. If two programs P1 ∈ S and P2 ∈ S have meanings

\[ \llbracket P_1 \rrbracket = f : T \to U_{\bot} \qquad \llbracket P_2 \rrbracket = g : T \to U_{\bot} \]

where U⊥ is a strict extension of the differential ring (U, δ), then the following identities hold:

\[ \delta\big( \llbracket \Phi P_1 \rrbracket + \llbracket \Phi P_2 \rrbracket \big) = \delta\big( \llbracket \Phi P_1 \rrbracket \big) + \delta\big( \llbracket \Phi P_2 \rrbracket \big) \]

\[ \delta\big( \llbracket \Phi P_1 \rrbracket \cdot \llbracket \Phi P_2 \rrbracket \big) = \delta\big( \llbracket \Phi P_1 \rrbracket \big) \cdot \llbracket P_2 \rrbracket + \llbracket P_1 \rrbracket \cdot \delta\big( \llbracket \Phi P_2 \rrbracket \big) . \]

These two identities relate the meaning of the transformed programs to the usual mathematical notion of derivation, namely additivity and the Leibniz rule.

Note that if the domains of computation are those of polynomials, or power series, with the usual derivation operation, then the chain rule holds. In more general domains of computation, however, we add the chain rule as a requirement. That leads us to speak of algorithmic differential rings. Consequently, we require that all the domains of computation in which we carry out differentiation are actually algorithmic differential rings, and not just differential rings. This non-trivial subtlety is not currently expressible in OpenAxiom, and is left as an axiomatic requirement that the user must satisfy in writing a program.

2. Strategies of derivative evaluation

Based on the chain rule and associativity of function composition, one can develop various strategies for evaluating the derivative. For example, given the program

P1; ...; Pn

the meaning of its algorithmic differentiation transform is

\[ \llbracket \Phi(P_1; \ldots; P_n) \rrbracket = \llbracket \Phi P_n \rrbracket \circ \cdots \circ \llbracket \Phi P_1 \rrbracket \]

which can be evaluated using various computation strategies. One approach is a "naive" reading of the composition from right to left, leading to the so-called forward mode, where the first instruction is transformed, then the second, etc., and the derivatives are propagated forward. This approach is simple to comprehend and implement. However, it has the inconvenience that it generates computations whose complexity is expressed in terms of the number of (independent) input variables. Therefore it might not be very efficient for computing gradients of scalar functions of many variables. Another approach is a reading of the compositions from left to right. That strategy requires the computation of derivatives of functions not yet executed. Consequently, its actual implementation requires running the original programs first, then 'reverting' the sequencing of computations to propagate the generated derivatives, thus leading to the so-called reverse mode. This computation strategy has the property that its complexity is in terms of the number of (dependent) output variables. It is therefore a good candidate for computing the gradient of a scalar function of many variables. These two strategies of computation are "extreme" in some sense, and an AD tool may actually use a mix of associations of the composition, a class of transformations called hybrid mode.
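The cost asymmetry between these two readings can be made concrete on the Jacobians; the following back-of-the-envelope account is standard in the AD literature, restated in our notation, and is not specific to Spad. If ⟦P1⟧ maps R^n to R^m and ⟦P2⟧ maps R^m to R^k, the chain rule gives

\[ J_{P_2 \circ P_1}(x) \;=\; J_{P_2}\big(P_1(x)\big)\, J_{P_1}(x) . \]

Forward mode propagates one tangent column J·v through this product per independent input, so assembling the full Jacobian costs n sweeps; reverse mode propagates one adjoint row w^T·J per dependent output, costing k sweeps. For a scalar function (k = 1) of many variables (n ≫ 1), the reverse association is the economical one, as stated above.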

3. Control flow and differentiability

The practical realization of algorithmic differentiation of a program P builds on two concepts. First, the abstract evaluation [17] of the operational semantics (§a) of P yields a program Q that is in simple form. The simple form relies on the algebraic properties of the code guaranteed by the algorithmic differential ring. The simple form is a form of pseudo-SSA: each statement consists of an assignment (or a return statement) where the expression on the right is a single function call whose arguments are variables or constants, i.e., there are no sub-expressions. The simple form is computed using a recursive algorithm that acts on expressions by moving non-constant and non-variable sub-expressions out of the expression. The simple form more closely matches the intuitive notion of our theory of derivative, and also helps to mitigate potential problems from side-effects. Note that this stage is parameterized by the operational semantics of the programming language being used (Spad, in our case).

From this program Q in simple form one extracts the control flow and data flow graphs. The data flow graph is usually called the computational graph [45] of the program P. The control flow graph extracted from Q is such that each node of the graph is a basic block, that is, a maximal sequence of instructions without "jumps". Each basic block is therefore a straight-line program and, through the denotational semantics of §b, defines a differentiable function. However, at the join points of the control flow graph, there is no guarantee that we obtain a differentiable function, e.g., in the case of the absolute value function. Even when the mathematical function being computed is differentiable, it may be that its expression as an algorithm contains transfers of control that introduce artificial anomalies. Consider

funnyId(x: Integer): Integer ==
  if x=2 then 2 else x

which computes the identity function in a curious way. Its transform is

funnyId(x: Jet Integer): Jet Integer ==
  if x.value = 2 then jet(2,0) else x

with derivative 0 at x = 2, which is clearly unintuitive behavior.1 It is impossible to detect such contrived constructs: the transformation acts on the program as-is and can make no guesses as to any 'underlying' meaning. However, the theory of data flow analysis as abstract interpretation [17] provides a framework for studying certain classes of these issues, which we do not investigate further here. The programmer must be aware of these limitations.

E. The Spad compiler

The OpenAxiom system can operate in interactive, "batch", and "compiled" modes. The interactive mode uses an interpreter; the batch and compiled modes use a compiler. The compiler can be invoked from within the interpreter by issuing the system command

)compiler

The interpreter and the compiler understand slightly different dialects of the Spad language. This is due partly to design, and partly to a turbulent history. The compiler is intended for library development (large-scale programming), whereas the interpreter is intended for convenient interactive conversation (small-scale programming). Also, the interpreter supports type inference ("guessing"), as opposed to the compiler which, for the most part, restricts itself to type checking. Since our framework is primarily intended for library development, we focus on the compiler component of the OpenAxiom system.

1Although, according to the definitions of our operational and denotational semantics, this behavior is correct; it is merely unintuitive to the user.

We emphasize that our framework works with both the compiled and the interpreted dialects of Spad. An input Spad source file is decomposed into a stream of tokens by the Spad lexer. The token stream is transformed into a parse tree by the Spad parser. That parse tree undergoes further transformation by a post-parser transformer, resulting in a parse form. The parse form is still close to Spad source. For example, here are the parse forms of the definitions of the Monoid and IntMonoid examples from §1:

(DEF (Monoid) ((Category)) (())
  (CATEGORY
    (SIGNATURE * (% % %))
    (SIGNATURE (One) (%))))

(DEF (IntMonoid) ((Monoid)) (())
  (CAPSULE
    (LET Rep (Integer))
    (DEF (* x y) (() % %) (() () ())
      ((elt (Integer) +) (rep x) (rep y)))
    (DEF (One) (%) (())
      (:: (Zero) (Integer)))))

The parse form is then transformed into another internal abstract syntax tree, which is used as input to the semantic analyzer. The job of the semantic analyzer is to type check the abstract syntax tree, and to resolve dependencies on previously compiled programs and load them if necessary. The output of the semantic analyzer is a fully typed abstract syntax tree, which is translated by the code generator into Lisp code. A copy of the resulting Lisp code is saved on disk for future use, and another copy is given to the run-time system (a Lisp system) for evaluation. The current OpenAxiom system uses the Steel Bank Common Lisp (SBCL) implementation. SBCL is capable of compiling Lisp code to native object code, which is subsequently loaded into the running Lisp image. As a result, the OpenAxiom compiler compiles input Spad programs to native object code for execution.

F. Implementation

Our AD framework, shown schematically in Figure 32, is entirely implemented as an OpenAxiom library, meaning that no source code modification to the OpenAxiom compiler is required. The implementation consists of

• a small library interface to the Spad compiler, to retrieve the parse forms and the typed abstract syntax tree over objects of the SExpression domain;

• OpenAxiom domains and packages working on the output of the compiler interface.

The library interface for the transformation tool was built to support a large class of library-driven semantic enhancements to Spad. The interface to the compiler was developed as part of the engineering effort of a larger project to rationalize the OpenAxiom system. The interface itself is designed to allow sophisticated analysis and transformation of the internal representation of Spad code, without having to modify the compiler. We have also written an unparser that pretty-prints the internal representation as Spad code.

P0 ∈ S  →(parse)→  P1  →(expand, ADCatExpand)→  P2  →(ad)→  P2′

Fig. 32. The pipeline for performing AD. The process starts with a program P0, which is parsed by the compiler into the representation P1 and given to the function expand from the ADCatExpand package to convert P1 into simple form. The simple form P2 is the input to the ad function, which computes the algorithmic derivative P2′.

At the time of this writing, we support only the forward mode. We do not yet support direct computation of gradients. A gradient can be obtained from repeated calls computing the partial derivative in several directions; however, that is inefficient and potentially incorrect, because if the function depends on global variables, there is no guarantee that successive calls yield the desired values.

1. Transformation to simple form

The main requirement of our transformer is that the input program be in simple form (§3), which simulates the operational semantics outlined in §a. Since it is unrealistic to require users to write code in that specific form, we have implemented a package called ADCatExpand that transforms an arbitrary Spad program into simple form. The expander "walks" the AST, identifying and expanding any expressions that are not in simple form. For example, all the arguments of function calls that are not constants or variables are replaced with temporary variables, such that every operation consists of the name of the operation and references to variables. This transformation does not change the meaning of the input program.
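To make the effect of the expander concrete, the rewriting it performs looks as follows; this is our own schematic rendering in C++ notation (the actual tool operates on Spad ASTs and draws fresh temporary names from GENSYM):

#include <cmath>

// Before expansion: a nested right-hand side.
double before(double x, double y, double z) {
  return std::log(x + y) * z;
}

// After expansion to simple form: every statement performs exactly one
// operation whose arguments are variables or constants.
double after(double x, double y, double z) {
  double t1 = x + y;
  double t2 = std::log(t1);
  double t3 = t2 * z;
  return t3;
}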

2. First order prolongation

Our AD framework uses the theory of jets [27]. A jet can be thought of as a value together with a sequence of derivatives (whichever derivatives are appropriate). We can construct the jet of a function by lifting a function that computes on values so that it computes on jets. If the input program computes a mathematical function f : M → N, as determined by the standard semantics, then our tool generates a program that computes the first prolongation jet1(f) : jet1 M → jet1 N.

More precisely, the framework defines a domain Jet parameterized by a domain from the Ring category. For example, Jet(Integer) and Jet(Float) designate the first order jets of Integer and Float, respectively.

A subset of the code for implementing the domain Jet in our framework is shown below.

Jet(T: Ring): Public == Private where
  Public == with
    jet: (T, T) -> %
      ++ construct a jet value from a pair
    elt: (%, "value") -> T
      ++ retrieve the "value" field of a jet
    elt: (%, "delta") -> T
      ++ retrieve the "delta" field of a jet
    setelt: (%, "value", T) -> T
      ++ set the "value" field of a jet
    setelt: (%, "delta", T) -> T
      ++ set the "delta" field of a jet
    -- ...
  Private == add
    Rep == Record(val: T, der: T)
    jet(v, d) == [v, d]
    elt(jet: %, x: "value") == jet.val
    elt(jet: %, x: "delta") == jet.der
    setelt(jet: %, x: "value", jetValue: T) == jet.val := jetValue
    setelt(jet: %, x: "delta", jetDelta: T) == jet.der := jetDelta
    -- ...

Note that here Jet takes types T of the Ring category; in reality, Jet should take types T from the category of algorithmic differential rings, as explained in §1. That is a current limitation of our framework, partly due to a lack of expressivity in the Spad language. The generated program is defined as an overloaded function on the jets. In particular, we do not use a fancy symbol mirroring the mathematical "functorial notation" jet1(f). A benefit of that approach is that the generated program is structurally similar to the input program.

A fundamental drawback is that it prevents us from defining functions on jet spaces which are not prolongations. For example, we would like Jet(T) to be a member of the Ring category, but doing so requires defining the operators * and +, which conflict with operators we may be prolonging when working with Jet(Integer) or Jet(Float), for example. This is an aspect that we plan to improve on in future work, as it makes perfect sense to define functions on jets (to represent differential equations) that are different from prolongations.

3. Initial environment

When our library is loaded into OpenAxiom, it starts with an environment that contains the prolongations of certain operators on built-in types. These operators are defined within the Jet domain:

Jet(T: Ring): Public == Private where
  Public == with
    -- ...
    +: (%, %) -> %
    -: (%, %) -> %
    *: (%, %) -> %
    -: % -> %
    if T has Field then
      /: (%, %) -> %
    if T has TranscendentalFunctionCategory and T has Field then
      log: % -> %
    if T has TranscendentalFunctionCategory then
      exp: % -> %
      sin: % -> %
      cos: % -> %
    -- ...

  Private == add
    +(x: %, y: %): % ==
      jet(x.value + y.value, x.delta + y.delta)
    -(x: %, y: %): % ==
      jet(x.value - y.value, x.delta - y.delta)
    *(x: %, y: %): % ==
      jet(x.value * y.value, x.delta * y.value + x.value * y.delta)
    -(x: %): % ==
      jet(-x.value, -x.delta)
    if T has Field then
      /(x: %, y: %): % ==
        r : T := x.value / y.value
        jet(r, (x.delta - r * y.delta) / y.value)
    if T has TranscendentalFunctionCategory and T has Field then
      log(x: %) ==
        jet(log(x.value), x.delta / x.value)
    -- ...

Note how the rules for computing with jets embody both the actual derivatives and the chain rule. The initial environment serves as a basis for the Spad compiler and interpreter to successfully complete the evaluation of derivatives.
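These computation rules can be cross-checked against a plain C++ rendering of first-order jets, i.e., dual numbers. The sketch below is our own illustration mirroring the Spad definitions above; it is not the framework's generated code.

#include <cmath>
#include <cstdio>

// A first-order jet over double: a value together with one derivative.
struct Jet {
  double value;
  double delta;
};

Jet jet(double v, double d) { return {v, d}; }

// The prolonged operators carry both the derivative rules and the
// chain rule, exactly as in the initial environment above.
Jet operator+(Jet x, Jet y) { return {x.value + y.value, x.delta + y.delta}; }
Jet operator-(Jet x, Jet y) { return {x.value - y.value, x.delta - y.delta}; }
Jet operator*(Jet x, Jet y) {
  return {x.value * y.value, x.delta * y.value + x.value * y.delta};
}
Jet operator/(Jet x, Jet y) {
  double r = x.value / y.value;
  return {r, (x.delta - r * y.delta) / y.value};
}
Jet log(Jet x) { return {std::log(x.value), x.delta / x.value}; }

int main() {
  // d/dx [ log(x * x) + x ] = 2/x + 1, which is 2 at x = 2.
  Jet x = jet(2.0, 1.0);  // seed the derivative dx/dx = 1
  Jet y = log(x * x) + x;
  std::printf("value = %g, delta = %g\n", y.value, y.delta);  // delta = 2
}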

4. Forward mode

The core of our framework's user interface consists of functions defined in the package ADCatForward.

There are two versions: the first takes an s-expression (the internal data structure used to hold the AST) and returns an s-expression that contains the differentiated code. The second takes a String (a file name), and returns a list of s-expressions corresponding to the first prolongations of all domains, categories, packages, etc. found within that file.

The AD function is a simple layer on top of a routine that implements the Visitor pattern [30] over the AST. The AD function walks the AST until all derivatives have been generated. We use category membership assertions to reject invalid input functions, e.g., those with argument and return types that are not from Ring.

For example, since the type String is not asserted to belong to Ring, we reject a request to compute the derivative of a function that returns a String. This is another place where we rely on type annotations for meaningful transformation.

5. Examples

a. The GRADIENT paper

Our first example is the function “f” from Monagan and Neuenschwander [76] that we have translated to Spad as:

f(x: Float, n: Integer): Float ==
  if n = 0 then
    return 0
  else
    a : Float := 0
    b : Float := x
    for k in 1..(n-1) repeat
      h : Float := b
      b := log(a+b)
      a := h
    b

Monagan and Neuenschwander's example did not include explicit typing for the arguments. Here, we have chosen Float to emphasize the point that our framework (and algorithmic differentiation in general) does not require that function parameters be "symbols".

First, the tool runs expand (from ADCatExpand) on the program shown above to produce the output shown below. The code as shown is the output of our pretty-printer. The expander uses the Lisp function GENSYM() to generate fresh names for temporaries; therefore the names of the temporaries are not necessarily the same each time expand is run on the same function.

-- intermediate expanded code
f(x: Float, n: Integer): Float ==
  if n = Zero() then
    G6327 := Zero()
    return(G6327)
  else
    a : Float := Zero()
    b : Float := x
    h : Float
    for k in (One())..(n - One()) repeat
      h := b
      G6328 := a + b
      b := log(G6328)
      a := h
    b

After this, the actual AD transformation is performed; the final output program is shown below.

f(x: Jet(Float), n: Jet(Integer)): Jet(Float) ==
  if n.value = Zero() then
    G2668 := Zero()
    return(G2668)
  else
    a : Jet(Float) := Zero()
    b : Jet(Float) := x
    for k in (One())..(n.value - One()) repeat
      h : Jet(Float) := b
      G2669 := a + b
      b := log(G2669)
      a := h
    b

Note that all operator names have retained their original spelling. This is because the AD transformer has inserted the first prolongation of each operator as an overloaded function defined on the first prolongations of its source and target.

Note that all operator names have retained their original spelling. This is be- cause the AD transformer has inserted the first prolongation of the operators as an overloaded function defined on the first prolongations of its source and target. This 126

This means that the correct operation is selected based on its argument types in addition to its name. The above program, despite its syntactic similarity with the original program f, does compute numerical values of the derivative of the function computed by f. To see that this is the case, we will use OpenAxiom's Expression domains. Even with its strong emphasis on numerical algebraic computations, OpenAxiom also provides the parameterized domain Expression for symbolic manipulation. Replacing Float with Expression Integer in the original function f and transforming that function with our AD tool results in the function below. The result, again, is similar to the original function; only type annotations have changed and temporaries have been introduced to hold intermediate values:2

f(x: Jet(Expression(Integer)), n: Jet(Integer)): Jet(Expression(Integer)) ==
  if n.value = Zero() then
    G2668 := Zero()
    return(G2668)
  else
    a : Jet(Expression(Integer)) := Zero()
    b : Jet(Expression(Integer)) := x
    for k in (One())..(n.value - One()) repeat
      h : Jet(Expression(Integer)) := b
      G2669 := a + b
      b := log(G2669)
      a := h
    b

Evaluation of f(jet(x,1),jet(4,0)).delta yields

2The coincidence of the fresh variable names is an artifact of running the framework from a new instance of the interpreter; the first 2500-or-so fresh variable names are consumed by the setup of the interpreter itself.

\[
\frac{\log(x) + 2x + 1}{(x\log(x) + x^2)\log(\log(x) + x) + x\log(x)^2 + x^2\log(x)}
\]

which coincides with the derivative of f(x,4):

log(log(log(x) + x) + log(x)).

b. Tchebychev polynomials

Next, we consider the Tchebychev polynomial defined by the recurrence relation shown in §A. Here, again, we considered the original function taking a Float, and another taking Expression Integer to check the symbolic evaluation. The first prolongation of

Tchebychev(x: Expression Integer, n: Integer): Expression Integer ==
  if n=0 then return 1
  if n=1 then return x
  2*x*Tchebychev(x,n-1) - Tchebychev(x,n-2)

is computed as:

Tchebychev(x: Jet Expression Integer, n: Jet Integer): Jet Expression Integer ==
  if n.value = 0 then
    G2716 := 1
    return G2716
  if n.value = 1 then
    return x
  G2718 := 2 * x
  G2721 := 1
  G2720 := n - G2721
  G2719 := Tchebychev(x,G2720)
  G2717 := G2718 * G2719
  G2723 := n - 2
  G2722 := Tchebychev(x,G2723)
  G2717 - G2722

Evaluating Tchebychev(jet(x,1), jet(5,0)).delta yields

    80x^4 − 60x^2 + 5

which agrees with the derivative of T_5 = 16x^5 − 20x^3 + 5x.

G. Employing generic algorithmic differentiation

In this chapter we have defined a generic library for AD and demonstrated the use of that library. Our methodology requires a language with support for both generic programming and the ability to transform code, preferably at compile-time. This section summarizes what the task of creating an AD library entailed, and what it entails for an application programmer to use an AD library to obtain code that algorithmically differentiates a function. Our AD library consists of three main parts: (1) the concept algorithmic differential ring, which represents any type that carries the mathematical structure of a ring and allows an operator to be defined that satisfies the algebraic definition of differentiation; (2) a Jet type, which is a type that carries the augmenting derivative information; and (3) an algorithmic differentiation transform, which augments functions to carry derivative information. Our AD transform has three main duties: (1) replacing type annotations with the Jet of the type annotation; (2) converting literals to a Jet, e.g., the expression 10 becomes jet(10) (or the appropriate construct to lift from values to jets-of-values); and (3) projecting all variables used in control flow to the value of that variable, e.g., if v < 0 is the control-expression of a loop, then the AD function must emit an expression such as v.value < jet(0).value.
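
To make these duties concrete, the following C++ sketch (our own illustration; the actual implementation is the Spad transform operating on typed abstract syntax trees) shows a hand-written Jet type carrying value and delta fields, together with first prolongations of a few primitive operators. The function f from the beginning of this section is transcribed by hand; that transcription is precisely the rewriting the AD transform automates. For simplicity, the discrete parameter n is a plain int here rather than a Jet(Integer).

    #include <cmath>
    #include <iostream>

    // A forward-mode "jet": a value paired with the derivative it carries.
    struct Jet {
        double value;  // the original value
        double delta;  // the augmenting derivative information
    };

    Jet jet(double v, double d = 0.0) { return Jet{v, d}; }

    // First prolongations of the primitive operators: overloads defined on
    // jets, so the correct operation is selected by its argument types.
    Jet operator+(Jet a, Jet b) { return Jet{a.value + b.value, a.delta + b.delta}; }
    Jet operator*(Jet a, Jet b) {
        return Jet{a.value * b.value, a.delta * b.value + a.value * b.delta};
    }
    Jet log(Jet a) { return Jet{std::log(a.value), a.delta / a.value}; }

    // The transformed f: arithmetic is lifted to jets; control flow uses the
    // projected value (here n is already a plain value).
    Jet f(Jet x, int n) {
        if (n == 0) return jet(0.0);
        Jet a = jet(0.0), b = x;
        for (int k = 1; k <= n - 1; ++k) {
            Jet h = b;
            b = log(a + b);
            a = h;
        }
        return b;
    }

    int main() {
        Jet r = f(jet(2.0, 1.0), 4);                     // seed dx/dx = 1
        std::cout << r.value << " " << r.delta << "\n";  // value and derivative at x = 2
    }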

With a generic AD library and an AD transform available, obtaining the code to algorithmically differentiate a function becomes effortless. For example, assuming the programmer wants to compute the derivatives of the function call f(a, 0), the programmer just rewrites the function call as AD(f(a, 0)). The result is that both the value and the derivative of the function are computed.

H. Conclusion

Automatic differentiation is a well-known technique for computing derivatives, and mature software packages exist for applying it in practice. However, we found the theory of algorithmic differentiation incomplete. This chapter explores such a theory. We base our work on top of the OpenAxiom computer algebra system, which allows highly generic programs to be written in terms of "categories," or classes of algebras, instead of concrete data-types. We outline the formal semantics of Spad, the library extension language of OpenAxiom, and define the exact requirements for when Spad programs can be subjected to algorithmic differentiation. We apply our theory in an implementation of an algorithmic differentiation tool for Spad. The implementation transforms Spad programs, at the level of typed abstract syntax trees, into programs that compute derivatives as well as their original values. Our prototype implementation illustrates the benefits of the algebraic approach to algorithmic differentiation.

1. Related work

M. Monagan and W. Neuenschwander [76] implemented the forward mode of AD, a package called GRADIENT, in the Maple computer algebra system. The GRADIENT package was later extended by D. Villard and M. Monagan [111] to cover the reverse mode. To the best of our knowledge, our work is the first documented attempt at implementing AD for the Axiom computer algebra system.

2. Future work

There are several directions in which we would like to extend our work. First, we are working on a way to cast algorithmic differentiation as a form of abstract interpretation as outlined by Cousot and Cousot [17]. The description of algorithmic differentiation in terms of abstract interpretation requires a rigorous description of the semantic meaning (and type checking); a first step is begun in Chapter V. We feel that such a description will allow us to rigorously describe the semantics of our forward mode implementation; even more, the reverse and hybrid modes take significant liberties with the meaning of programs, which we hope to capture. We naturally expect such an implementation to support higher order derivatives. Second, the "Axiom way" of implementing computational mathematics is to rely on strong typing to structure programs. Our initial work on interfacing with the Spad compiler suggests that we need Spad domains and packages for strongly typed representations of Spad programs themselves. We have found it natural to integrate our framework into the Spad compiler as a library. Finally, and probably most importantly from the "Axiom way" of doing computational mathematics, we will continue the development of the formal semantics of the Spad programming language as well as the algebraic theory of algorithmic differentiation, to gain a better understanding of algorithmic differentiation and to stimulate the construction of generic AD libraries.

CHAPTER V

LOCAL SPECIALIZATION FOR OPENAXIOM

Support for active libraries is closely tied to support for the generic programming paradigm. In particular, the computations which allow an active library to observe and (more importantly) respond to the inputs of the library's components tend to be both static and typeful. By static, we mean that the generation of specialized codes based upon the inputs to the components occurs during compilation (or similar phases). By typeful we mean that the active library's activities are based upon type information and other static traits of the inputs to the components of the library. The ability to select alternate variants of the implementations of algorithms and data structures based upon compile-time parameters is called specialization. In the computer algebra system OpenAxiom, the library extension language Spad uses a variant of the specialization feature called local specialization. Local specialization is built upon the functorial nature of data-structures in Spad; functoriality in Spad is expressed through data-structures which act like functions that, given input values at compile-time, return a type. The local specialization feature allows multiple variants of a data-structure (and associated algorithms) at a single point of definition, dependent upon compile-time values such as types or values passed as parameters to the data-structure. This chapter continues the rationalization of the OpenAxiom computer algebra system; in particular, it provides a principled compilation strategy for local specialization informed by the theory of abstract interpretation.

A. Introduction

Most algebraic structures in computational mathematics can be seen as instances of functors [23, 22]. Functors are naturally expressed in the typeful programming paradigm as parametrized datatypes. Certain properties of parametric data-structures may be common to all instances of a given data-structure. For instance, the integers Z/kZ modulo a positive natural number k always carry the structure of a ring. However, it can also happen that a specific instance of a parametrized structure has capabilities that are not shared by other instances [23]. For example, the ring Z/kZ is also a field when k is a prime number. In the generic programming setting, expressing such conditional properties for parametrized datatypes in a single definition is called local specialization. Any computer algebra system using a typeful programming language must include provisions for the support of local specialization. The programming languages of the AXIOM system family [58] and the Aldor system [115, 114, 12] of computer algebra systems are categorial: they support abstract interfaces to concrete data-structures. Importantly, those systems' programming languages allow the direct expression of parametric data-structures and interfaces with conditional properties. The functorial data-structures (both abstract and concrete) of these systems are parametrically polymorphic [99]: the definition stays the same over a range of type and value parameters. The support for local specialization in such a categorial model presents interesting challenges. In the OpenAxiom computer algebra system [25], and in all other variants of the AXIOM family, abstract algebraic structures are defined as categories and concrete instances of such structures as domains. One can think of categories as specifications, and of domains as implementations of specifications.1 Each local specialization

1To relate the programming notions of categories and domains to the rest of this dissertation, we note that a category is (very roughly) analogous to a concept in C++0x, and a domain is analogous to a model of a concept.

is controlled by a conditional statement, and each condition's predicate is a logical formula. Currently, OpenAxiom only supports predicates over domains. General support for predicates involving user-defined functions or values does not exist; this limitation has proven to be a hindrance in practice and a source of several bugs.

For instance, the current implementation of Z/kZ in OpenAxiom always provides the operations of the structure field, particularly division and modular arithmetic; it is left to the user to test that the value k is prime before using the field operations. This chapter describes our extension to OpenAxiom that generalizes the conditions of conditional specifications to allow arbitrary user-defined predicates. For example, the specification that the ring Z/kZ is a field when the parameter k is a prime can be written as:

IntegerModCategory(k: Integer()): Category == Type() with
  if prime? k then Field() else Ring()
  ···
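
For readers who know C++, the other language studied in this dissertation, the sketch below conveys the same idea of a conditional interface; it is purely our own illustration (using a C++20 requires-clause, not an OpenAxiom mechanism). The ring operations are always exported, while the operation needed for field arithmetic exists only when the modulus is prime.

    #include <cstdio>

    // Compile-time primality test (illustration only).
    constexpr bool is_prime(int k) {
        if (k < 2) return false;
        for (int d = 2; d * d <= k; ++d)
            if (k % d == 0) return false;
        return true;
    }

    template <int k>
    struct IntegerMod {
        int value;

        // Ring operations are always exported.
        friend IntegerMod operator+(IntegerMod a, IntegerMod b) {
            return {(a.value + b.value) % k};
        }
        friend IntegerMod operator*(IntegerMod a, IntegerMod b) {
            return {(a.value * b.value) % k};
        }

        // The field operation is exported only when k is prime, mirroring
        // "if prime? k then Field() else Ring()".
        IntegerMod inverse() const requires (is_prime(k)) {
            // Fermat's little theorem: value^(k-2) mod k.
            int r = 1, b = value, e = k - 2;
            while (e > 0) {
                if (e & 1) r = r * b % k;
                b = b * b % k;
                e >>= 1;
            }
            return {r};
        }
    };

    int main() {
        IntegerMod<7> a{3};
        std::printf("%d\n", a.inverse().value);  // 5, since 3*5 = 15 = 1 (mod 7)
        // IntegerMod<6>{3}.inverse();           // rejected: 6 is not prime
    }

One essential difference remains: Spad evaluates such predicates when domains are instantiated at run time, whereas this C++ rendering forces the predicate into a compile-time constant.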

We note that local specialization with generalized predicates is already handled in Aldor [115]. The algorithms underlying the implementation, however, have never been demonstrated or justified. A key contribution of this chapter is a rational and principled implementation of this feature. More specifically, our contributions include:

• Specification of the operational semantics of categories and has-predicates in the AXIOM system family.

• Generalization of the set of legal predicates in conditional specifications to allow user-defined predicates.

• Description and implementation of the conditional specification as an alternate

evaluation of Spad code for deriving the static semantics of categories.

The rest of the chapter is structured as follows: Section B provides an overview of our general approach; Section C defines the Spad language, and specifies the concrete semantics of Spad categories and predicates; we extend the set of legal predicates in Section D; we present our framework for the alternate evaluation of Spad in Section E, and demonstrate, with practical examples, what our framework allows in Section F; we discuss related work in Section G, and conclude in Section H.

B. An overview

The family of AXIOM systems, OpenAxiom in particular, provides a very rich type system. OpenAxiom offers a strongly typed programming language where types are instantiated at run time. Yet, OpenAxiom supports an erasure semantics [19] for its core, except for (runtime) intensional representations of domains. A standard formal presentation of OpenAxiom's programming language (Spad) would require a dynamic semantics for Spad, along with type checking rules (the static semantics), and a separate proof that the reduction rules are sound with respect to the type system. We feel that this approach makes a type system look like an afterthought, and sheds little light on the profound connection between the type system and the dynamic semantics. Our approach, instead, is inspired by the types as abstractions [16] philosophy that holds that a type system for a programming language is a sound approximation of the language's dynamic semantics. The construction of the alternate evaluation for Spad is very general and proceeds in stages. First we define the dynamic semantics of the language. Next, we answer questions about or compute properties of programs based only on the programs' static structure. This leads to the first approximation: the collecting semantics. The

collecting semantics associates properties to each program point, for all conceivable executions, and is thus in general not computable. This allows us to seek computable approximations, guided by the kind of information one is looking for. In our case, we are seeking computable characterizations of properties that hold in each branch of a conditional specification. In our presentation we define a translation from Spad to an intermediate language; the intermediate language is a variant of the lambda calculus (see Section 2). The operational semantics are defined for this intermediate language (see Section 2); we then use these operational semantics to describe the collecting semantics (see Section 1) for the abstract interpretation presented in this chapter. This is a particularly heavyweight approach; however, our intention is to provide a framework onto which we can hang a number of analyses, not just the analysis presented here.

C. The Spad programming language

OpenAxiom is a library-centric and extensible system. Users extend OpenAxiom by writing libraries using the programming language called Spad. We describe enough of the Spad syntax to enable a discussion of Spad categories (and domains). We then specify the concrete semantics of Spad categories by defining a translation from Spad to an internal language, followed by the specification of the operational semantics of this internal language. We use the translation and the operational semantics of the internal language to inform the construction of the static semantics for categories in Spad. The general philosophy of the Spad language is based on the abstract datatype methodology. In this methodology, a concrete algebraic structure is introduced by its specification (which we call its category) and its implementation (called its domain).

Specifications (abstract interfaces) will generally have many implementations, and it is typical for categories to be defined independently from their implementations. For example, we can capture the general notion of monoid structure by building a category that specifies a binary operation named * along with a constant named 1. In Spad, we can express this as:

Monoid(): Category == Type with
  ∗: (%,%) → %
  1: %

The datatype Integer obviously satisfies the Monoid specification, i.e., with multiplication and the literal 1. It so happens that the list data structure also satisfies this specification: the binary operator is the concatenation operation, and the constant 1 is the empty list. In the OpenAxiom system, datatypes satisfy (belong to) categories by explicit assertions. For instance, here is a definition for a domain that defines a list to carry the structure of a monoid:

ListMonoid(T: Type): Monoid() with coerce: List T → % == add
  import List T
  Rep == List T
  coerce(l: List T) == per l
  1:% == per empty()
  (x:%) ∗ (y:%) == per concat(rep x, rep y)

The unusual indentation is due to the fact that Spad is a white-space sensitive language that defines code blocks by indentation level. This definition says that

ListMonoid is a parametric data-structure that takes a single argument T, and returns a structure ListMonoid that carries the structure of a monoid (the first line of code). The specifications for ListMonoid are taken from the category Monoid, and are extended to include a function to convert a List to a ListMonoid. The

implementation of the specification comes after the statement == add; this section is called the capsule. The first line of the capsule imports the operations of the domain

List T into the current scope. The next line of the capsule, Rep == List T, states that the ListMonoid's internal representation is a List. The remainder of the capsule provides definitions for the specifications defined by the category Monoid. As a brief note, the operation per changes the type of a List to a ListMonoid, thus converting a List value into a ListMonoid value. The operation rep does the reverse.
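
The per/rep idiom has a rough C++ analogue, sketched below purely for illustration (all names are our own): a wrapper class whose private representation plays the role of Rep, with per wrapping a list as a monoid value and rep unwrapping it.

    #include <list>
    #include <iostream>

    // A C++ analogue of ListMonoid: the representation (Rep) is std::list<T>;
    // "per" wraps a list as a monoid value, "rep" unwraps it.
    template <typename T>
    class ListMonoid {
        std::list<T> rep_;                               // Rep == List T
        explicit ListMonoid(std::list<T> l) : rep_(std::move(l)) {}
    public:
        static ListMonoid per(std::list<T> l) { return ListMonoid(std::move(l)); }
        const std::list<T>& rep() const { return rep_; }

        static ListMonoid one() { return per({}); }      // 1:% == per empty()
        friend ListMonoid operator*(ListMonoid x, ListMonoid y) {
            x.rep_.splice(x.rep_.end(), y.rep_);         // concat(rep x, rep y)
            return x;
        }
    };

    int main() {
        auto a = ListMonoid<int>::per({1, 2});
        auto b = ListMonoid<int>::per({3});
        for (int v : (a * b * ListMonoid<int>::one()).rep())
            std::cout << v << ' ';                       // prints: 1 2 3
    }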

1. Syntax

A Spad program can be understood as a sequence of category and domain definitions, followed by an expression. Figure 33 shows the syntax of the parts of Spad needed for this chapter. Specifically, a category is a sequence of specifications, each of which is either a function declaration or a category extension. The latter is another category instance, and indicates that all of the specifications exported by that instance should be included in the current category. The category extension mechanism is transitive, reflexive, and antisymmetric. In the version of Spad used in this chapter every specification is conditional.

Program A Spad program can be understood as a sequence of category and domain definitions, followed by an expression.

Program      P ::= (C | D)∗ e
CategoryDef  C ::= χ : Category == Type() with W+
DomainDef    D ::= χ : τ == add A+
CallForm     χ ::= x([x : τ]∗)
Cond. Spec.  W ::= if π then E1 else E2
Cond. Def.   A ::= if π then d1 else d2
Export       E ::= x : τ | τ
Definition   d ::= Rep == τ | χ : τ == e | Nil
Predicate    π ::= true | π1 and π2 | π1 or π2 | not π | τ1 has τ2
Expression   e ::= if e1 then e2 else e3 | e(e∗) | c
Type         τ ::= x(τ∗) | x | c | (τ∗) → τ | %
Identifier   x

Fig. 33. The syntax of the core parts of Spad.

CategoryDef A Spad category definition is a sequence of conditional specifications; the entire sequence is called the Exports section of the category. All Spad categories must extend the category Type, which is a special built-in root category with no specifications. In the example shown below, the left-hand category definition declares a Spad category with two signatures: a neutral element 1 and a binary operator ∗ of a monoid structure. The right-hand definition extends Monoid with the inverse operation to capture the mathematical notion of group structure.2

Monoid(): Category == Type() with
  if true then 1: % else Type()
  if true then ∗: (%, %) → % else Type()

Group(): Category == Type() with
  if true then Monoid() else Type()
  if true then inverse: (%, %) → % else Type()

2The definition of Spad in this chapter does not allow specifications or definitions that are not conditional; we thus represent unconditional specifications and definitions through the use of conditionals with the predicate true for their condition.

DomainDef A Spad domain consists of a sequence of conditional definitions; this sequence is referred to as the capsule of the domain. The definition of the domain provides implementations for the specifications defined by the categories the domain carries: the capsule may define the representation of the objects belonging to the domain and also provide definitions for operations declared in the exports sections of the categories it implements.

CallForm A call form consists of an identifier and a parenthesized sequence of signatures declaring formal parameters. A call form is needed in the definitions of a Spad category, Spad domain, and function.

Conditional Specification A conditional specification is expressed in Spad as an if-statement. The condition of the if-statement can be any legal Predicate π. The then- and else-branches must each contain a single statement, which must be an Export.

Conditional Definition A conditional definition is written as an if-statement. The condition of the conditional definition can be any Predicate π. Each branch of the conditional definition must contain a single statement, a definition.

Export An export is either a declaration (the signature of a function, such as binary plus +:(%,%)->%, or a distinguished value, such as 1:%) or a category extension, such as Monoid().

Definition A definition is the binding of an identifier or a function call expression to a Spad category, Spad domain, or an ordinary function. The Spad language, as understood by the Spad compiler, does not allow ordinary function definitions at the top level. The special symbol Nil is introduced to allow a null statement.

Predicate A predicate is used as the condition of a Conditional Specification. It can be either an atomic predicate, such as true or a has-predicate, or a logical formula of atomic predicates.

Expression An expression is either an if-statement, a function call, or a constant. Looping and iteration are not possible, but recursion is.

Type A type is an instantiation of a Spad category or Spad domain, a function type, the name of a Spad category or Spad domain, or a carrier set denoted by %. In a category definition, % denotes a placeholder for a domain. In a domain definition, % denotes the domain itself.

Identifier An identifier is a finite sequence of characters. The set of identifiers in Spad is countably infinite.

2. An internal language

We define the semantics of a Spad program by translation into an internal representation language which allows us to express categories and specifications within those categories. We provide a full description of the internal language, and a translation from Spad to this internal language. While the full framework is somewhat heavyweight, it provides a foundation on which to build other transformations not considered in this chapter. We note that the current translation only covers the syntax and semantics of the Spad language needed to describe categories and the

conditional specifications within the categories. Figure 34 and Figure 35 show the syntax and operational semantics, respectively, for this language. The internal language is the untyped lambda calculus with some extensions. We provide support for a list data structure, offering the primitive list operators cons,3 car, and cdr, as well as a predicate for detecting an empty list. The term Rid(c1, e2) represents a Spad domain or category, where the constant c1 is the name of the domain or category. The variable e2 is a list of representations of the specifications, i.e., the function declarations and category extensions, of the domain or category. Since an R-term represents a category (or domain), an R-term also represents a Spad type in the internal language. The R-term types are used by a number of (compile-time) functions, including the has-predicate. The term R→(e1, e2) represents a Spad function definition where e1 is a list of the representations of the parameter types of the function, and e2 is a lambda abstraction representing the body of the function.

To access the elements of the above terms we define the functions π1 and π2 that return the first and second element, respectively, when applied to either of the "R-terms". The constants c include OpenAxiom's set of primitive types (e.g., integers and floating-point numbers) and symbols (used, e.g., as names of categories and domains).

Terms   e ::= λx.e | e1 e2 | cons(e1, e2) | car(e) | cdr(e) | isNil?(e)
            | if e1 then e2 else e3 | let x = e1 in e2
            | Rid(c1, e2) | R→(e1, e2) | π1(e) | π2(e) | is→?(e) | isid?(e) | v
Values  v ::= c | x | λx.e | cons(v1, v2) | nil | true | false

Fig. 34. The syntax of the internal language.

The operational semantics for the intermediate language are shown in Figure 35.

3In the later sections, we use the notation "[a1, . . . , an]" as a shorthand for constructing a list from the elements a1, . . . , an.

The operational semantics are defined as small-step semantics. The rules include function application, the construction of lists, access to the head and tail of lists, and a test to determine if a list is empty. The rules also include the introduction of named functions (let-statements), and if-then-else statements. The final set of rules are for the projection of the elements out of an R-term, and for tests to determine whether a value is either an Rid-term or an R→-term.

(λx.e) v ↦ e[v/x]

e1 ↦ e1'  ⇒  e1 e2 ↦ e1' e2                  e ↦ e'  ⇒  v e ↦ v e'

car(cons(v1, v2)) ↦ v1                       cdr(cons(v1, v2)) ↦ v2

e1 ↦ e1'  ⇒  cons(e1, e2) ↦ cons(e1', e2)    e ↦ e'  ⇒  cons(v, e) ↦ cons(v, e')

e ↦ e'  ⇒  car(e) ↦ car(e')                  e ↦ e'  ⇒  cdr(e) ↦ cdr(e')

e ↦ e'  ⇒  isNil?(e) ↦ isNil?(e')
e ↦ nil  ⇒  isNil?(e) ↦ true                 e ↦ v, v ≠ nil  ⇒  isNil?(e) ↦ false

let x = v in e ↦ e[v/x]
e1 ↦ e1'  ⇒  let x = e1 in e2 ↦ let x = e1' in e2

e2 ↦ e2'  ⇒  if true then e2 else e3 ↦ e2'
e3 ↦ e3'  ⇒  if false then e2 else e3 ↦ e3'
e1 ↦ e1'  ⇒  if e1 then e2 else e3 ↦ if e1' then e2 else e3

ei ↦ ei'  ⇒  πi(R•(e1, e2)) ↦ ei'
e1 ↦ e1'  ⇒  R•(e1, e2) ↦ R•(e1', e2)        e ↦ e'  ⇒  R•(v, e) ↦ R•(v, e')

e ↦ e'  ⇒  is•?(e) ↦ is•?(e')
e = R•(e1, e2)  ⇒  is•?(e) ↦ true            e ≠ R•(e1, e2)  ⇒  is•?(e) ↦ false

Fig. 35. The small-step operational semantics of the internal language; a rule of the form "p ⇒ c" has premise p and conclusion c, and a rule without ⇒ is an axiom. The • symbol stands for id or →. The index i in the projection rule is either 1 or 2.

3. Translation of Spad to internal language

As the basis for the abstract interpretation discussed in Section E, we describe the translation of a Spad category into the internal language. We think of this translation as a specification for a Spad compiler that compiles Spad to a lower-level language such as assembly, or to a virtual machine. Formally, we define a syntax-directed translation function T•⟦·⟧ : SPAD → IL, where SPAD is the set of Spad programs and IL the set of programs in the internal language. This function, defined by case for each syntactical form of Spad, is shown in Figure 36. The translation of a Spad category into the internal language will result in a value (a function returning a list) that is the representation of a Spad category in the internal language. This construction allows us to understand the semantics of both instantiated and uninstantiated categories. To translate a category definition to the internal language, we iteratively translate a conditional specification at each program point of the category definition, in addition to the callform of the category. (The callform is an identifier and a parenthesized list of formal parameters of a category.) Each iteration step computes the internal representation of a callform or a function declaration, or recursively translates a category extension following the transitivity of category extension. That is, when a category C1 declares that it extends the category C2, we replace this specification with the specifications of C2, a recursive operation. The translation of a category definition results in a lambda abstraction in IL introduced by a let-statement which names the lambda abstraction. The lambda abstraction takes an extra parameter %^ for the carrier set % in SPAD. The body of the abstraction is a list of if-statements (whose branches contain "R-terms") representing a sequence of specifications. The if-statements of the conditional specifications in the

TC⟦x0(x1 : τ1, . . . , xk : τk) : Category == Type() with W1 ··· Wn⟧ ⟹
    let Tτ⟦x0⟧ = λTτ⟦%⟧. λTτ⟦x1⟧. · · · λTτ⟦xk⟧.
        cons(if true then TX⟦x0(x1, . . . , xk)⟧ else Type^(%^),
             TW⟦W1⟧ ⊕ · · · ⊕ TW⟦Wn⟧ ⊕ Type^(%^))
    in

TW⟦if π then E1 else E2⟧   ⟹  if Tπ⟦π⟧ then TE⟦E1⟧ else TE⟦E2⟧
Tσ⟦x⟧                      ⟹  ‘x
TX⟦x⟧                      ⟹  Tτ⟦x⟧
TX⟦x(τ1, . . . , τk)⟧      ⟹  Rid(Tσ⟦x⟧, [TX⟦τ1⟧, . . . , TX⟦τk⟧])
TX⟦(τ1, . . . , τk) → τ0⟧  ⟹  R→([TX⟦τ0⟧, TX⟦τ1⟧, . . . , TX⟦τk⟧], nil)
TE⟦x(τ1, . . . , τk)⟧      ⟹  Tτ⟦x⟧(Tτ⟦%⟧, Tτ⟦τ1⟧, . . . , Tτ⟦τk⟧)
TE⟦x⟧                      ⟹  Tτ⟦x⟧(Tτ⟦%⟧)
TE⟦x : τ⟧                  ⟹  Rid(Tσ⟦x⟧, [TX⟦τ⟧])
Tπ⟦true⟧                   ⟹  true
Tπ⟦τ1 has τ2⟧              ⟹  car(TX⟦τ2⟧) ∈ Tτ⟦τ1⟧
Tπ⟦not π⟧                  ⟹  not(Tπ⟦π⟧)
Tπ⟦π1 and π2⟧              ⟹  and(Tπ⟦π1⟧, Tπ⟦π2⟧)
Tπ⟦π1 or π2⟧               ⟹  or(Tπ⟦π1⟧, Tπ⟦π2⟧)
Tτ⟦x(τ1, . . . , τk)⟧      ⟹  Tτ⟦x⟧(Tτ⟦τ1⟧, . . . , Tτ⟦τk⟧)
Tτ⟦x⟧                      ⟹  x^

Fig. 36. The translation rules from Spad to the internal language. Functions ⊕ and ∈ are primitive operators defined in Figure 37. The function ⊕ concatenates two lists together; the function ∈ takes a value and a list as arguments, and checks the membership of the value in the list; the functions and, or, and not are the usual boolean operators. We use ‘x and x^ to denote the symbol and the identifier x, respectively, in the internal language.

let ⊕ = λleft. λright.
      if isnil? left then right
      else cons(car left, fix(concat)(cdr left, right)) in
let member_equal? = λleft. λright.
      if isnil? left and isnil? right then true
      else if isnil? left then false
      else if isnil? right then false
      else if isid? left and isid? right then
        if π1 left = π1 right
        then fix(member_equal?)(π2 left, π2 right)
        else false
      else if is→? left and is→? right then
        if fix(member_equal?)(π1 left, π2 right) then true else false
      else if member_equal?(car left, car right)
      then fix(member_equal?)(cdr left, cdr right)
      else false in
let ∈ = λvalue. λspeclist.
      if isnil? speclist then false
      else if member_equal?(car speclist, value) then true
      else fix(∈)(value, cdr speclist) in

Fig. 37. Definitions of the functions ∈ and ⊕.

body of a category are translated to if-statements in IL; has-predicates are translated to the function ∈; category extensions are translated to function calls to other translated categories in IL; the Spad boolean operators and, or, and not are translated to functions of the same name in the IL; function declarations are translated to

"R→-terms". The translation into IL relies on two functions, ⊕ and ∈, being defined: the first function is the concatenation of two lists, and the second function determines whether a value is within a list. These two functions are defined in Figure 37. We note that the function ∈ is recursive, and is aware of the conventions used within the various R-terms.

As an example, we translate the category ComplexCategory, defined in Figure 38,

ComplexCategory(R: CommutativeRing()): Category == Type() with
  if true then CommutativeRing() else Type()
  if R has IntegralDomain() then exquo : (%, R) → % else Type()
  if true then FullyLinearlyExplicitRingOver(R) else Type()
  ···

Fig. 38. The definition of the category ComplexCategory. To simplify the translation, we use a preprocessor to transform an unconditional specification into a conditional form with condition true and the category Type() in the else branch; Type() is the empty root category that has no specifications.

which represents the extension of a ring structure by √−1. Applying the translation rules at the topmost level yields (showing only the parts of the category that are included in Figure 38):

let Tτ⟦ComplexCategory⟧ = λTτ⟦%⟧. λTτ⟦R⟧.
    cons(if true then TX⟦ComplexCategory(R)⟧ else Type^(%^),
         TW⟦if true then CommutativeRing() else Type()⟧ ⊕
         TW⟦if R has IntegralDomain() then exquo : (%, R) → % else Type()⟧ ⊕
         TW⟦if true then FullyLinearlyExplicitRingOver(R) else Type()⟧ ⊕ · · · ⊕ Type^(%^))
in

where the function ⊕ is the concatenation of two lists, shown in Figure 37. We note that the callform of the category is used to generate the name of the function representing the category, the names of the parameters of that function, and

an R-term which will become the reflexive assertion that the ComplexCategory is the category ComplexCategory. Beginning with the translation of the callform to the name of the function, we get:

    Tτ⟦ComplexCategory⟧ ⟹ ComplexCategory^

that is, the identifier of the callform becomes an identifier in IL. The parameters of the callform are similarly translated into identifiers in IL. We note that we do not translate the types of the formal parameters: the type information is embedded within the specification lists that are passed as arguments

to the function. For example, an argument passed to parameter R would be a specification list, containing run-time evidence of all the types of the domain represented by R. Similarly, the callform is also translated into a specification (from the second line of the translation):

    TX⟦ComplexCategory(R)⟧ ⟹ Rid(Tσ⟦ComplexCategory⟧, [TX⟦R⟧])
                            ⟹ Rid(‘ComplexCategory, [Tτ⟦R⟧])
                            ⟹ Rid(‘ComplexCategory, [R^]).

Conditional specifications in categories are translated into if-statements in IL; furthermore, the value true from Spad becomes the value true in IL. Category extension declarations are translated into function calls of the same name. The function Type^ returns the specification list [Rid(‘Type, [])].

    TW⟦if true then CommutativeRing() else Type()⟧
      ⟹ if Tπ⟦true⟧ then TE⟦CommutativeRing()⟧ else TE⟦Type()⟧
      ⟹ if true then Tτ⟦CommutativeRing⟧(TX⟦%⟧) else Tτ⟦Type⟧(TX⟦%⟧)
      ⟹ if true then CommutativeRing^(%^) else Type^(%^)

We now turn our attention to the translation of a has-predicate. Intuitively, a has-predicate X has C is asking if a domain X has the type C. We translate this into a question asking if the specification of C is within the list of specifications of X:

    Tπ⟦R has IntegralDomain()⟧ ⟹ car(TX⟦IntegralDomain()⟧) ∈ Tτ⟦R⟧
                                ⟹ Rid(‘IntegralDomain, []) ∈ R^.

D. User-defined predicates

Figure 39 shows the syntax we add to Spad to support user-defined predicates, as well as the translation rules for the new syntactic form. The rules below rely on a function called lookup for function resolution. The function lookup takes the name of the function being called, the representation of the types of the arguments of the

function, and the domain the function is to be found in. For this chapter, we let Spad be parametrized by the implementation of lookup; however, a particular instance of lookup could be implemented by, for instance, name-lookup. Equipped with the new

Predicate π ::= x(π∗)$τ

Tπ⟦x0(π1, . . . , πk)$τ⟧ ⟹ lookup(Tσ⟦x0⟧, [car(Tπ⟦π1⟧), . . . , car(Tπ⟦πk⟧)], Tτ⟦τ⟧)(Tπ⟦π1⟧, . . . , Tπ⟦πk⟧)

Fig. 39. The extension of the Spad grammar and translation rules to support user-defined predicates. The function lookup returns the function called by the user, given the name of the function, and the domain the Spad function is defined in.

language construct, we can show more of the definition of ComplexCategory. We add a specification that expresses that the category carries the structure of a Field when both its parameter R carries the structure of a Field and the polynomial x^2 + 1 is irreducible in R:

if R has Field() and irreducible?((monomial(1,2)$R + 1)$R)$R
then Field() else Type()

The syntax $R above specifies that R is the domain that provides the implementation of the function preceding the $ symbol. Using the extended translation rules, the above user-defined predicate is translated as follows:

    Tπ⟦irreducible?((monomial(1,2)$R + 1)$R)$R⟧ ⟹
        lookup(‘irreducible?, [car(Tπ⟦(monomial(1,2)$R + 1)$R⟧)], R)(
            Tπ⟦(monomial(1,2)$R + 1)$R⟧)

E. Static analysis of categories

Conditional specifications control the capabilities of categories, i.e., the set of exported function declarations of a category is determined by the run-time evaluation of the conditional specifications. However, we want to be able to reason about such capabilities at compile-time so that we can, for example, determine the best matching unambiguous function definition given a function call. The main thrust of our approach is to seek computable characterizations of specifications that hold in each branch of a conditional specification, and to statically approximate a category's specifications and the specifications' conditions through an abstract interpretation framework. The implementation of our framework is inspired by Cousot's notion of types as abstract evaluations. Our static analysis extracts a set of abstract specifications from each conditional specification of a Spad category, updating an abstract store with the set. Intuitively, the recursive translation and substitution of the specifications for category extension in the concrete translation is the recursive abstract evaluation of a category and the join of its abstract store to the current store. The concrete evaluation of a conditional specification selects one branch according to the run-time evaluation of its condition, while, correspondingly, we abstractly evaluate both branches and join the resulting abstract stores. The final step in the static analysis is to simplify the resulting abstract store. The abstract store can suffer from expression explosion due to the particular way we construct the store. At each step of the abstract evaluation we simplify the predicates of each abstract specification by applying the rules of classical logic, e.g., true and π ⇒ π. Furthermore, we define an operation called reduction, which attempts to simplify the abstract store using information found within the store itself (this operation is developed and explained in

Section 4).

1. The structure of the abstract domain

The abstract domain is defined as the set of all sets of abstract specifications. An abstract specification ⟨E, π⟩ is defined as an approximation of a conditional specification: it pairs a category specification E with a condition π under which E holds. We call each set of abstract specifications an abstract store σ. We note that the smallest abstract store is {⟨% has Type(), true⟩}, because all categories must extend Type. We define the join operator ⊔A as

σ ⊔A σΔ = {⟨E, π1⟩ | ∀⟨E, π1⟩ ∈ σ, ∄⟨E, π2⟩ ∈ σΔ}
        ∪ {⟨E, π2⟩ | ∀⟨E, π2⟩ ∈ σΔ, ∄⟨E, π1⟩ ∈ σ}
        ∪ {⟨E, π1 or π2⟩ | ⟨E, π1⟩ ∈ σ ∧ ⟨E, π2⟩ ∈ σΔ}.

The join operator takes two abstract stores, σ and σΔ, and constructs an updated store. The abstract specifications from σ and σΔ whose category extensions appear in only one of the two stores are added to the updated store directly; if two abstract specifications from different stores have a common extension, a combined abstract specification is added instead. The combined abstract specification keeps the common extension, but combines the predicates using the logical connective or; the common extension holds when either of the conditions is true.
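
To make the definition concrete, here is a small C++ rendering of ⊔A; it is our own illustration, flattening predicates to strings, whereas the real store holds Spad predicates. Specifications unique to one store are copied unchanged, and a shared extension receives the disjunction of its two predicates.

    #include <map>
    #include <string>
    #include <iostream>

    // An abstract store: category extension E -> predicate pi under which E holds.
    using Store = std::map<std::string, std::string>;

    // The join of two abstract stores: specifications unique to either store
    // are kept as-is; a common extension gets the disjunction of its predicates.
    Store join(const Store& s, const Store& sd) {
        Store out = s;
        for (const auto& [e, pi2] : sd) {
            auto it = out.find(e);
            if (it == out.end()) out[e] = pi2;
            else it->second = "(" + it->second + ") or (" + pi2 + ")";
        }
        return out;
    }

    int main() {
        // These stores anticipate the IntegerModCategory example in Section 2.
        Store field = {{"% has Field()", "prime?(k)"}, {"% has Ring()", "prime?(k)"}};
        Store ring  = {{"% has Ring()",  "not prime?(k)"}};
        for (const auto& [e, pi] : join(field, ring))
            std::cout << "<" << e << ", " << pi << ">\n";
    }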

2. Abstract evaluation of syntactic forms

We define the syntax-directed rules for the static analysis of a Spad category in Figure 40. An interpretation function ⟦·⟧_•^Π : SPAD → AS takes a category definition in Spad syntactic form, then computes the abstract store containing the corresponding abstract specifications of the category. The superscript Π is the assumption of the current interpretation, and is required when interpreting the branches of conditional specifications (its use is explained later).

⟦x0(x1 : τ1, . . . , xk : τk): Category == Type() with W1 ··· Wn⟧_C^Π
    ⟹ ⟦x0(x1 : τ1, . . . , xk : τk)⟧_X^Π ⊔A ⟦W1⟧_W^Π ⊔A · · · ⊔A ⟦Wn⟧_W^Π
       ⊔A {⟨% has Type(), true⟩}

⟦x0(x1 : τ1, . . . , xk : τk)⟧_X^Π
    ⟹ {⟨% has χ, Π⟩} ⊔A [x1/%]⟦τ1⟧_E^Π ⊔A · · · ⊔A [xk/%]⟦τk⟧_E^Π
       where χ = ⟦x0⟧_τ(⟦x1⟧_τ, . . . , ⟦xk⟧_τ)

⟦if π then E1 else E2⟧_W^Π ⟹ ⟦E1⟧_E^(Π and π) ⊔A ⟦E2⟧_E^(Π and (not π))

⟦x0(τ1, . . . , τk)⟧_E^Π
    ⟹ [τ1/x1, . . . , τk/xk]⟦x0(x1 : τ1', . . . , xk : τk'): Category == Type() with W1 ··· Wn⟧_C^Π

⟦x : τ⟧_E^Π   ⟹ {⟨% has x : τ, Π⟩}
⟦Type()⟧_E^Π  ⟹ {⟨% has Type(), Π⟩}
⟦x⟧_τ         ⟹ ‘x

Fig. 40. Rules for abstract interpretation of Spad programs.

To abstractly evaluate a category definition we interpret the callform, followed by iteratively abstractly evaluating each conditional specification. We demonstrate the top-level rule by showing the abstract evaluation of ComplexCategory from Figure 38, along with the additional conditional specification enabled in Section D:

    ⟦ComplexCategory(R : CommutativeRing())⟧_X^true
      ⊔A ⟦if true then CommutativeRing() else Type()⟧_W^true
      ⊔A ⟦if R has IntegralDomain() then exquo : (%, R) → % else Type()⟧_W^true
      ⊔A ⟦if true then FullyLinearlyExplicitRingOver(R) else Type()⟧_W^true
      ⊔A ⟦if R has Field() and irreducible?((monomial(1,2)$R + 1)$R)$R then Field() else Type()⟧_W^true
      ⊔A · · · ⊔A {⟨% has Type(), true⟩}

In this abstract evaluation, we assign Π the value true as the initial assumption, which means no assumptions (only a top-level abstract evaluation can have the assumption true). In the abstract evaluation we can think of the join operator ⊔A as the approximation of the concatenation operator ⊕ in the concrete semantics. The callform introduces an abstract specification for the category itself (reflexivity of category extension), and abstract specifications of the formal parameters in the callform. The type of each formal parameter is abstractly evaluated and the resulting

abstract store for each parameter has all uses of the variable % renamed to the name of the formal parameter. For the callform of ComplexCategory, the abstract evaluation results in an abstract specification asserting the category is the ComplexCategory; this is joined to the abstract store of CommutativeRing, which has had all instances of the variable % renamed to R, i.e., specifications about CommutativeRing are now about R:

    ⟦ComplexCategory(R : CommutativeRing())⟧_X^true ⟹
        {⟨% has ‘ComplexCategory(‘R), true⟩} ⊔A [R/%]⟦CommutativeRing()⟧_E^true.

Each conditional specification of a category definition results in the join of abstract stores from the evaluation of each branch's specification: function declarations are directly interpreted as abstract specifications; category extensions are recursively interpreted following the transitivity of category extensions. The assumption for interpreting each branch of the conditional specification is computed as the conjunction of the enclosing category's assumption Π and either the predicate π or not π for the then- and else-branches, respectively. For instance, the last conditional specification in the example above is interpreted as

    ⟦if π then Field() else Type()⟧_W^true ⟹ ⟦Field()⟧_E^(true and π) ⊔A ⟦Type()⟧_E^(true and (not π))
        where π ← R has Field() and irreducible?((monomial(1,2)$R + 1)$R)$R.

We show part of the abstract store for the then-branch:

    {⟨% has ‘Field(), true and π⟩, . . . , ⟨% has ‘Type(), true and π⟩}

Note that the conditions of the abstract specifications for this branch are not just true: they are the conjunction of the assumption of ComplexCategory and the predicate of the conditional specification. This means that the abstract specifications in this branch only hold when both the assumption of ComplexCategory holds and the predicate holds. We briefly illustrate the interpretation of an extension to a parameterized category using the category extension FullyLinearlyExplicitRingOver(R) in the then-

branch of the third conditional specification. The abstract interpretation results in a recursive interpretation over the definition of FullyLinearlyExplicitRingOver:

    [R/X]⟦FullyLinearlyExplicitRingOver(X : Ring()) : Category == Type() with . . .⟧_C^true

The formal parameter X is renamed to R; however, we do not rename the variable % in category extensions. This mimics the fact that, in the concrete evaluation, the variable %^ of the current category is passed as the argument to instantiate the category which is to be extended.

We now consider the more interesting category IntegerModCategory, given in the introduction; we show the interpretation of the first conditional specification:

    ⟦if prime?(k) then Field() else Ring()⟧_W^true
        ⟹ ⟦Field()⟧_E^(true and prime?(k)) ⊔A ⟦Ring()⟧_E^(true and not prime?(k))

σField = {⟨% has ‘Field(), true and prime?(k)⟩, ⟨% has ‘Ring(), true and prime?(k)⟩, . . .}
σRing  = {⟨% has ‘Ring(), true and not prime?(k)⟩, . . .}

And the final result of the interpretation becomes

σField ⊔A σRing = {⟨% has ‘Field(), true and prime?(k)⟩,
                   ⟨% has ‘Ring(), (true and prime?(k)) or (true and not prime?(k))⟩, . . .}

By applying a Boolean simplification function over the conditions of each abstract specification, we get

σField ⊔A σRing = {⟨% has ‘Field(), prime?(k)⟩, ⟨% has ‘Ring(), true⟩, . . .}

which is the result we expect: the category IntegerModCategory unconditionally extends category Ring, and extends category Field when k is a prime.

3. Expression explosion

Due to the construction of the abstract evaluation, the size of the predicate π of an abstract specification ⟨E, π⟩ grows exponentially in the number of

times a specification E is used within a category, and in the depth of category extension. For instance, the category Type is involved in a predicate consisting of several hundred terms for the examples given above. Such large predicates can defeat the proposed purpose of the analysis simply by making compilation and static checking too expensive. Our solution is to simplify the predicates of the abstract specifications at all points in the abstract evaluation. In general, we can use the rules of classical logic to simplify the predicates; the simplification rules are classical tautologies. An example of such a rule is the elimination of a disjunction: false or π ⇒ π. In general, the use of binary decision diagrams or other mechanisms, along with Boolean-expression simplification, can prevent significant expression explosion.
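
As an illustration of such rule-based simplification (our own sketch over a toy predicate AST, not Spad predicates), the following C++ applies the classical identities bottom-up; the (A and B) or (A and not B) rule mentioned later in Section F would be one more case, requiring structural equality.

    #include <memory>
    #include <string>

    // Minimal predicate AST. Simplification applies classical identities such
    // as "true and p => p" and "false or p => p".
    struct Pred;
    using P = std::shared_ptr<Pred>;
    struct Pred { enum { True, False, Atom, And, Or, Not } kind; std::string a; P l, r; };

    P mkTrue()  { return std::make_shared<Pred>(Pred{Pred::True}); }
    P mkFalse() { return std::make_shared<Pred>(Pred{Pred::False}); }

    P simplify(P p) {
        if (p->kind == Pred::And) {
            P l = simplify(p->l), r = simplify(p->r);
            if (l->kind == Pred::True)  return r;       // true and p => p
            if (r->kind == Pred::True)  return l;
            if (l->kind == Pred::False || r->kind == Pred::False) return mkFalse();
            return std::make_shared<Pred>(Pred{Pred::And, "", l, r});
        }
        if (p->kind == Pred::Or) {
            P l = simplify(p->l), r = simplify(p->r);
            if (l->kind == Pred::False) return r;       // false or p => p
            if (r->kind == Pred::False) return l;
            if (l->kind == Pred::True || r->kind == Pred::True) return mkTrue();
            return std::make_shared<Pred>(Pred{Pred::Or, "", l, r});
        }
        if (p->kind == Pred::Not) {
            P l = simplify(p->l);
            if (l->kind == Pred::True)  return mkFalse();
            if (l->kind == Pred::False) return mkTrue();
            return std::make_shared<Pred>(Pred{Pred::Not, "", l});
        }
        return p;  // True, False, Atom: already simplified
    }

    int main() {
        P atom = std::make_shared<Pred>(Pred{Pred::Atom, "prime?(k)"});
        P expr = std::make_shared<Pred>(Pred{Pred::And, "", mkTrue(), atom});
        return simplify(expr)->kind == Pred::Atom ? 0 : 1;  // "true and prime?(k)" => "prime?(k)"
    }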

4. Reduction of abstract stores

An abstract store can contain abstract specifications which provide information about other abstract specifications. This comes in the form of abstract specifications whose category specification E is part of (or all of) the predicate π of some other abstract specification. We use the fact that abstract specifications carry information about each other to simplify the predicates of abstract specifications within an abstract store. This process emulates the concrete evaluation of predicates for categories. In addition, this mechanism allows the user to add abstract specifications to an abstract store; this can be done to perturb the abstract store. For instance, if the user adds extra abstract specifications to the abstract store, the reduction mechanism can be used to find the abstract representation of a concrete instance of a category. Consider, for instance, the following two abstract specifications from the abstract store for the complex commutative ring:

    {⟨% has ‘Field(), ‘R has ‘Field() and lookup(. . .)⟩,
     ⟨‘R has ‘Field(), true⟩}

These two abstract specifications reduce to the following specifications:

    {⟨% has ‘Field(), lookup(. . .)⟩,
     ⟨‘R has ‘Field(), true⟩}.

In general, the reduction function takes each abstract specification a = ⟨Ea, πa⟩ in the abstract store and, for every b = ⟨Eb, πb⟩ with πb = Ea, replaces πb with πa. Taking the fixed point of this function results in the abstract store being fully reduced.
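
Reusing the string-keyed Store representation from the join sketch above (again, purely illustrative), the reduction can be phrased as a rewrite iterated to a fixed point:

    #include <map>
    #include <string>

    using Store = std::map<std::string, std::string>;

    // One pass: for each <Ea, pia>, rewrite any predicate that is exactly Ea
    // to pia. Returns whether anything changed.
    bool reduce_once(Store& s) {
        bool changed = false;
        for (const auto& [ea, pia] : s)
            for (auto& [eb, pib] : s)
                if (pib == ea && pib != pia) { pib = pia; changed = true; }
        return changed;
    }

    // Iterate to a fixed point: the store is then fully reduced.
    void reduce(Store& s) { while (reduce_once(s)) {} }

    int main() {
        Store s = {{"% has 'Field()", "'R has 'Field()"},
                   {"'R has 'Field()", "true"}};
        reduce(s);  // "% has 'Field()" now maps to "true"
    }

A real reduction must also rewrite predicates in which Ea occurs only as a conjunct, as in the Field example above; matching whole predicates keeps the sketch short.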

We consider another example taken from the IntegerModCategory:

    {⟨% has ‘IntegerModCategory(‘k), true⟩, . . . , ⟨% has ‘Field(), prime? k⟩}

where the abstract specification ⟨prime? k, true⟩ is added to the abstract store. Explicitly adding this abstract specification can happen, for instance, in a domain implementing IntegerModCategory, under the then-branch of a conditional definition whose condition is prime? k. The fixed point of the reduction results in the abstract store:

    {⟨% has ‘IntegerModCategory(‘k), true⟩, . . . , ⟨% has ‘Field(), true⟩}.

That is, all of the specifications of the structure field are added to the category

IntegerModCategory at this point.

F. Implementation

In OpenAxiom, Spad has a well-defined library interface to its internal representation. Furthermore, the compiler provides access to this interface during compilation. This

allows extensions to the compiler for static analysis to be written as a library, and not as a patch to the compiler. Our static analysis framework is implemented as a Spad library, and is available on our project website [95]; the provided implementation operates as a stand-alone analysis tool. A non-trivial problem associated with the analysis is the potential exponential explosion in the size of the predicates associated with each abstract specification. However, the facts generated through our static analysis almost always have the form (A ∧ B) ∨ (A ∧ ¬B), which is equivalent to just A. In addition, a large number of the abstract specifications include the predicates true and false; expressions with these predicates are immediately simplified. The result is that most of the expression explosion is mitigated and is not a significant performance barrier. Another potential problem is the recursive nature of the static analysis: every category will repeatedly analyze the same class of categories. Even more worrisome is the tree-like nature of most categories, where many categories extend the same set of categories, and the same analysis is performed multiple times. We could (but do not) build a cache to memoize the results of previous analyses, which would result in a significant speed-up of the static analysis.
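
Such a cache could be as small as a map from category names to previously computed stores. In the sketch below (our own; analyze_category is a stub standing in for the real per-category analysis), each category is analyzed at most once:

    #include <map>
    #include <string>

    using Store = std::map<std::string, std::string>;

    // Stub standing in for the real per-category analysis.
    Store analyze_category(const std::string& name) {
        return Store{{"% has " + name, "true"}};
    }

    // Memoized driver: each category is analyzed at most once.
    const Store& analyze_memo(const std::string& name) {
        static std::map<std::string, Store> cache;
        auto it = cache.find(name);
        if (it == cache.end())
            it = cache.emplace(name, analyze_category(name)).first;
        return it->second;
    }

    int main() {
        analyze_memo("Ring()");  // computed
        analyze_memo("Ring()");  // served from the cache
    }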

We tested our framework on the definition of ComplexCategory, and on a proposed category for the type of the domain IntegerMod. ComplexCategory is taken from OpenAxiom's algebra library (the standard library of OpenAxiom). We based

IntegerModCategory on the algebra’s implementation of the domain IntegerMod. For both examples, we show only a relevant subset of each result. First, the results for ComplexCategory:

ComplexCategory := {
  <% has Name(ComplexCategory(R)), true>,
  <% has Name(Field()), R has Field() and_
     irreducible?(monomial(1,2)$R + 1)$R)>,
  <% has Name(FullyLinearlyExplicitRingOver(R)), true>,
  <% has Sig(exquo(Mapping(%,%,R),Nil), R has IntegralDomain()>,
  ...
}

The conditions of the abstract specifications have been simplified, reduced, and output using a pretty-printer built into the analysis tool.

Next, we show the results for IntegerModCategory:

IntegerModCategory := {
  <% has Name(IntegerModCategory(k)), true>,
  <% has Name(Ring()), true>,
  <% has Name(Field()), prime?(k)$Integer()>,
  ...
}

We point out that, just as shown in Section 2, the abstract store shown above precisely states that IntegerModCategory unconditionally extends Ring, which matches our expectation. The results for both examples presented above have had the conditions of their abstract specifications simplified by a Boolean simplification function.

G. Related work

The static semantics of conditional categories for the AXIOM family is informally discussed in the "AXIOM book" (Sections 11.8 and 12.11) [58], and further explored by Santas [85], where typing rules for conditional categories and domains are specified and implemented. However, the implementation details of conditional categories and domains are still not obvious to compiler writers because of the missing formalization of their dynamic semantics. This chapter specifies the dynamic semantics of conditional categories using a lambda-calculus-based internal language. Specification of the semantics of conditional domains is still in progress, which we leave for future work.

Local specialization with generalized predicates can be handled in the Aldor system [115], but does not exist in the AXIOM family. We stress that, although local specialization has been implemented in Aldor before this work, the algorithms underlying the implementation have not been demonstrated or justified. Another key contribution of this chapter is a rational and principled implementation of local specialization as an application of abstract interpretation. The theory of abstract interpretation was first studied by Cousot and Cousot [17]; the development of this novel theory allows the redefinition of the role of a compiler: type-checking and other static analyses become abstract interpretations written as portable library code, instead of patches to the compiler. To the best of our knowledge, our work is the first documented attempt at a formal specification of the concrete semantics of conditional categories, and at an implementation of an abstract interpretation framework for local specializations in a categorial computer algebra system.

H. Conclusion

Local specialization has been a well-known technique for computer algebra systems for decades. However, we found its theoretical support to be under-explored. This chapter provides a principled construction of both the dynamic and static behaviors which support local specialization for computer algebra systems. In addition, it provides a formal description for general predicates, increasing the expressivity of algebraic structures written in OpenAxiom. Beginning with an operational semantics for the dynamic behavior of categories, we apply the theory of abstract interpretation to derive the static semantics of categories. For example, the derived static semantics can be used to discover the visibility of functions for type-checking at compile time.

We have implemented the abstract interpretation framework in the experimental branch of OpenAxiom. This prototype illustrates the benefits of a principled description of behaviors for the implementation of local specialization in computer algebra systems.

CHAPTER VI

CONCLUSION

In the half century of language development and design, as a community, we have discovered and promulgated a large number of language paradigms. Each of these paradigms is made up of numerous (mostly) orthogonal features. These features are re-composable because we language designers, being naturally good programmers,1 seek to break down our problems (the paradigm) into re-usable components. Re-usable components cry out for re-combination (in multi-paradigm languages). The result is a surplus of combinations of features, with very little guidance on how to use those features to actually make software, and little confidence that these features work as advertised. Since the writing of most of the (papers contributing to the) chapters of this dissertation, both Spad and C++ have continued to evolve. Particularly notable to this dissertation is that Spad has been forked: this has led to the (in my opinion) far superior implementation of the AXIOM system and Spad language, Open-Axiom. This allowed me to re-implement Chapter IV in a fraction of the time and a fraction of the number of lines of code. Furthermore, Chapter V was made possible both because of our desire to provide a more rational basis for the description of the features of Spad, and also because of certain new features (reflection) that have rapidly matured in Open-Axiom. C++ has also continued to evolve; the concepts proposal has been 'deferred' [101, 39] until a later date. (Other new features have been voted in: anonymous functions, variadic templates, an expanded standard library, etc.) The main thesis of this dissertation is that, in many ways, the capabilities pro-

1Given a rather large grain of salt.

vided by modern languages have far outreached our understanding of how to deploy those capabilities gainfully. We have a large set of language features already available to us, as programmers, and that set of features continues to expand. These language features were added to support certain mental frameworks for programming, for instance: classes, virtual functions (and inheritance), and member functions for object-oriented programming; or templates, associated types, and compile-time values to support parametric polymorphism. In languages like C++ and Spad, these features can be used in one fashion to support the programming paradigm they are intended for, but like any well-written library, the features can be combined and re-combined. The result is that we are constantly discovering new ways (paradigms) of programming.2 When we combine old features into new paradigms, we inevitably find places where it is inconvenient or inefficient to express our intention. Since programmers (and especially language designers) are an especially lazy breed, we inevitably propose yet another language feature to patch this difficulty. Regardless of the status of individual language features, an important and central tenet of any kind of programming remains that abstraction allows more sophisticated software. The same tenet applies to language design: we can have easy-to-build compilers (with easy-to-understand diagnostics), high-performance and efficient run-times, or abstraction and expressivity. Thus, an assembler is a (relatively) easy-to-build compiler and provides fairly high performance, but assembly lacks significantly in terms of abstraction capabilities. The typical trade-off made is to increase the abstraction capabilities and expressivity of a language at the cost of performance, e.g., Lisp or Python. Languages which support generic programming tend to take the "opposite" course: high levels

2A terribly unscientific poll is to note that as of September 2009, Wikipedia lists a few dozen programming paradigms. 162
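As a concrete (if tiny) illustration of the re-combination of features mentioned above, the following C++ sketch (all names are illustrative, not drawn from any chapter of this dissertation) mixes the two frameworks just described: a class hierarchy with virtual functions supplies run-time dispatch, while a function template supplies parametric polymorphism over any container of those objects.

    #include <iostream>
    #include <vector>

    // Object-oriented features: a hierarchy with dynamic dispatch.
    struct Shape {
      virtual double area() const = 0;
      virtual ~Shape() {}
    };

    struct Square : Shape {
      explicit Square(double s) : side(s) {}
      double area() const { return side * side; }
      double side;
    };

    // Generic-programming features: one algorithm, parameterized over any
    // container whose elements support the expression (*it)->area().
    template <typename Container>
    double total_area(const Container& shapes) {
      double sum = 0;
      for (typename Container::const_iterator it = shapes.begin();
           it != shapes.end(); ++it)
        sum += (*it)->area();
      return sum;
    }

    int main() {
      Square a(2.0), b(3.0);
      std::vector<const Shape*> v;
      v.push_back(&a);
      v.push_back(&b);
      std::cout << total_area(v) << "\n";  // prints 13 (4 + 9)
    }

Neither half knows about the other: the hierarchy can grow new shapes without touching total_area, and total_area works unchanged over a std::list or a user-defined container.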

The results, for example in C++, are spectacular: error messages (see the footnote in Section III.1) of legendary length, and compilers which literally take decades to implement.

In this dissertation I hope to have shown ways of synthesizing and completing (rationalizing) features for efficient abstraction. Dr. Veldhuizen’s discussion of parsimony [105] is a perfect framework on which to mount the discussion from Chapter II: it is important to provide as many (and as varied) entry points into a library as possible; as library writers we should cater to our users, not to some minimal level of functionality.3 Chapter III shows idioms and provides a mental framework for combining libraries, arguing that the act of library composition should itself be a programming construct, and thus a re-usable component. Chapters IV and V cast light on how to build robust library-directed extensions for languages and their compilers.

3 I am currently engaged in a project to provide a typeful enumeration of a complex instruction set computer: each assembly instruction is represented by several hundred high-level signatures.

There are clear directions left for this line of research. Perhaps the most tangible is the full enumeration of a type and deduction system for Open-Axiom’s Spad using the theory of abstract interpretation. While most of the Spad language itself is not covered by the construction given in this dissertation (Chapter V), the remaining difficulty lies merely in the dimensions of size and time, not of complexity.
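For the reader’s reference, the kernel of that framework, stated here in the standard notation of abstract interpretation [16, 17] rather than anything Spad-specific, is a pair of monotone maps between a concrete domain C (here, run-time values) and an abstract domain A (here, types) forming a Galois connection, under which deduction becomes a sound, computable approximation:

    \[
      \alpha : (C, \sqsubseteq_C) \to (A, \sqsubseteq_A), \qquad
      \gamma : (A, \sqsubseteq_A) \to (C, \sqsubseteq_C),
    \]
    \[
      \forall c \in C,\; \forall a \in A : \quad
      \alpha(c) \sqsubseteq_A a \iff c \sqsubseteq_C \gamma(a).
    \]

Enumerating Spad’s type and deduction system then amounts to exhibiting \alpha and \gamma (and a sound abstract counterpart of each operation) for every language construct: large in volume, as noted above, but not conceptually difficult.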

For C++ there are still clearly unresolved issues in our understanding of the basic features needed for generic programming. For instance, C++’s use of specialization with generic functions is not well-characterized. And even if generic functions were well understood, the guarantee they give is razor thin: monotonically non-decreasing improvements in “performance” (a minimal sketch of this style of specialization appears below). Yet C++ has shown (consider MTL4 vs. GOTO BLAS in Chapter III) that with just those tools it is possible to dramatically outperform even the most sophisticated optimizing compiler, or hand-written low-level assembly.
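To make that “razor thin” guarantee concrete, the following sketch re-creates, in miniature, the standard library’s iterator-category dispatch (the function names are illustrative; only std::iterator_traits and the category tags are real library facilities). Every overload meets the same specification; a stronger iterator category merely selects a faster implementation, so callers can expect only monotonically non-decreasing performance, never different behavior.

    #include <iterator>

    // Fallback: correct for any input iterator; takes linear time.
    template <typename Iter, typename Dist>
    void advance_impl(Iter& it, Dist n, std::input_iterator_tag) {
      while (n-- > 0) ++it;
    }

    // Specialization: random-access iterators jump in constant time.
    template <typename Iter, typename Dist>
    void advance_impl(Iter& it, Dist n, std::random_access_iterator_tag) {
      it += n;
    }

    // Dispatcher: one interface and one meaning for every caller; the
    // iterator's category influences performance only, never the result.
    template <typename Iter, typename Dist>
    void my_advance(Iter& it, Dist n) {
      advance_impl(it, n,
          typename std::iterator_traits<Iter>::iterator_category());
    }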


REFERENCES

[1] D. Abrahams, A. Gurtovoy, C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond, Addison-Wesley, 2004.

[2] P.D. Adams, R.W. Grosse-Kunstleve, L.W. Hung, T.R. Ioerger, A.J. McCoy, N.W. Moriarty, R.J. Read, J.C. Sacchettini, N.K. Sauter, T.C. Terwilliger, PHENIX: building new software for automated crystallographic structure determination, Acta Crystallographica Section D 58 (2002) 1948–1954.

[3] P. An, A. Jula, S. Rus, S. Saunders, T. Smith, G. Tanase, N. Thomas, N. Amato, L. Rauchwerger, STAPL: An adaptive, generic parallel C++ library, in: Languages and Compilers for Parallel Computing, volume 2624 of Lecture Notes in Computer Science, Springer, 2001, pp. 193–208.

[4] E. Angerson, Z. Bai, J. Dongarra, A. Greenbaum, A. McKenney, J.D. Croz, S. Hammarling, J. Demmel, C. Bischof, D. Sorensen, LAPACK: A portable linear algebra library for high-performance computers, Proceedings of Supercomputing ’90 (1990) 2–11.

[5] M.H. Austern, Generic programming and the STL: Using and extending the C++ Standard Template Library, Professional Computing Series, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1998.

[6] G. Baumgartner, M. Jansche, K. Läufer, Half & Half: Multiple Dispatch and Retroactive Abstraction for Java, Technical Report OSU-CISRC-5/01-TR08, Ohio State University, 2002.

[7] R.E. Bellman, On a routing problem, Quart. Appl. Math. 16 (1958) 87–90.

[8] C. Bischof, A. Carle, G. Corliss, A. Griewank, ADIFOR: Automatic differentiation in a source translator environment, in: ISSAC ’92: Papers from the International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 1992, ACM Press, pp. 294–302.

[9] C.H. Bischof, L. Roh, A.J. Mauer-Oats, ADIC: an extensible automatic differentiation tool for ANSI-C, Software—Practice and Experience 27 (1997) 1427–1456.

[10] L. Bourdev, H. Jin, Generic Image Library, 2006. opensource.adobe.com/gil.

[11] A. Breuer, P. Gottschling, D. Gregor, A. Lumsdaine, Effecting parallel graph eigensolvers through library composition, in: Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL), in Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2006), Los Alamitos, CA, USA, 2006, IEEE Computer Society, p. 466. http://doi.ieeecomputersociety.org/10.1109/IPDPS.2006.1639723.

[12] P.A. Broadbery, T. Gómez-Díaz, S.M. Watt, On the implementation of dynamic evaluation, in: ISSAC ’95: Proceedings of the 1995 International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 1995, ACM, pp. 77–84.

[13] A.T. Brünger, X-Plor Version 3.1: A System for X-Ray Crystallography and NMR, Yale University Press, 1993.

[14] M.M.T. Chakravarty, G. Keller, S. Peyton Jones, Associated type synonyms, in: ICFP ’05: Proceedings of the International Conference on Functional Programming, New York, NY, USA, 2005, ACM Press, pp. 241–253.

[15] M.M.T. Chakravarty, G. Keller, S. Peyton Jones, S. Marlow, Associated types with class, in: POPL ’05: Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, New York, NY, USA, 2005, ACM Press, pp. 1–13.

[16] P. Cousot, Types as abstract interpretations, invited paper, in: Conference Record of the Twenty-fourth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Paris, France, 1997, ACM Press, New York, NY, pp. 316–331.

[17] P. Cousot, R. Cousot, Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, in: POPL ’77: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, New York, NY, USA, 1977, ACM Press, pp. 238–252.

[18] K. Cowtan, The clipper project, Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography 40 (2002).

[19] K. Crary, S. Weirich, G. Morrisett, Intensional polymorphism in type-erasure semantics, J. Funct. Program. 12 (2002) 567–600.

[20] K. Czarnecki, U.W. Eisenecker, Generative Programming, Addison-Wesley, 2000.

[21] T. Daly, Axiom Volume 1: Tutorial, Lulu.com, 2005. ISBN: 978-1-4116-6597-2.

[22] J.H. Davenport, P. Gianni, B.M. Trager, Scratchpad’s view of algebra II: A categorical view of factorization, New York, NY, USA, 1991, ACM Press, pp. 32–38.

[23] J.H. Davenport, B.M. Trager, Scratchpad’s view of algebra I: Basic commutative algebra, in: DISCO ’90: Proceedings of the International Symposium on Design and Implementation of Symbolic Computation Systems, London, UK, 1990, Springer-Verlag, pp. 40–54.

[24] S. Dickinson, M. Pelillo, R. Zabih, Introduction to the special section on graph algorithms in computer vision, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 1049–1052.

[25] G. Dos Reis, Open-Axiom: The open scientific computation platform, http://open-axiom.org, 2009.

[26] ECMA-334, C# language specification, 2006. http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf.

[27] C. Ehresmann, Les prolongements d’une variété différentiable, C.R. Acad. Sc. Paris 233 (1951) 598–600.

[28] A. Fabri, G.J. Giezeman, L. Kettner, S. Schirra, S. Schönherr, On the design of CGAL, a computational geometry algorithms library, Software – Practice and Experience 30 (2000) 1167–1202. Special Issue on Discrete Algorithm Engineering.

[29] P.F. Felzenszwalb, D.P. Huttenlocher, Efficient graph-based image segmentation, Int. J. Comput. Vision 59 (2004) 167–181.

[30] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Publishing Co., New York, NY, USA, 1995.

[31] R. Garcia, J. Järvi, A. Lumsdaine, J. Siek, J. Willcock, An extended comparative study of language support for generic programming, Journal of Functional Programming 17 (2007) 145–205.

[32] K. Gopal, R. Pai, T.R. Ioerger, T. Romo, J.C. Sacchettini, TEXTAL: Artificial intelligence techniques for automated protein structure determination, Proceedings of the 15th Conference on Innovative Applications in Artificial Intelligence (IAAI) (2003) 93–100.

[33] K. Gopal, T. Romo, E. Mckee, K. Childs, L. Kanbi, R. Pai, J. Smith, J. Sacchettini, T. Ioerger, TEXTAL: Automated crystallographic protein structure determination, Proceedings of the Seventeenth Conference on Innovative Applications of Artificial Intelligence (2005) 1483–1490.

[34] K. Goto, R.A. van de Geijn, Anatomy of high-performance matrix multiplication, ACM Trans. Math. Softw. 34 (2008) 1–25.

[35] P. Gottschling, A. Lumsdaine, The Matrix Template Library 4, http://www.osl.iu.edu/research/mtl/mtl4/, 2008.

[36] P. Gottschling, D.S. Wise, A. Joshi, Generic support of algorithmic and structural recursion for scientific computing, The International Journal of Parallel, Emergent and Distributed Systems 0 (2008) 1–23. www.informaworld.com.

[37] D. Gregor, High-level static analysis for generic libraries, PhD thesis, Rensselaer Polytechnic Institute, 2004. http://www.osl.iu.edu/publications/prints/2004/Gregor-Thesis.pdf.

[38] D. Gregor, ConceptGCC: Concept extensions for C++, http://www.generic-programming.org/software/ConceptGCC, 2005.

[39] D. Gregor, What happened in Frankfurt?, 2009. http://cpp-next.com/archive/2009/08/what-happened-in-frankfurt/.

[40] D. Gregor, J. Järvi, J. Siek, B. Stroustrup, G. Dos Reis, A. Lumsdaine, Concepts: Linguistic support for generic programming in C++, in: OOPSLA ’06: Proceedings of the 2006 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2006, ACM Press, pp. 291–310.

[41] D. Gregor, A. Lumsdaine, Lifting sequential graph algorithms for distributed-memory parallel computation, in: OOPSLA ’05: Proceedings of the 2005 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 423–437.

[42] D. Gregor, M. Marcus, T. Witt, A. Lumsdaine, Foundational Concepts for the C++0x Standard Library, Technical Report N2677=08-0187, ISO/IEC JTC 1, Information technology, Subcommittee SC 22, Programming language C++, 2008. www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2677.pdf.

[43] D. Gregor, J. Siek, Implementing Concepts, Technical Report N1848=05-0108, ISO/IEC JTC 1, Information Technology, Subcommittee SC 22, Programming Language C++, 2005. www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1848.pdf.

[44] D. Gregor, B. Stroustrup, J. Widman, J. Siek, Proposed Wording for Concepts (Revision 8), Technical Report N2741=08-025, ISO/IEC JTC 1, Information Technology, Subcommittee SC 22, Programming Language C++, 2008.

[45] A. Griewank, Evaluating derivatives: principles and techniques of algorithmic differentiation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2000.

[46] A. Griewank, D. Juedes, J. Utke, Algorithm 755: ADOL-C: a package for the automatic differentiation of algorithms written in C/C++, ACM Trans. Math. Softw. 22 (1996) 131–167.

[47] R.W. Grosse-Kunstleve, N.K. Sauter, N.W. Moriarty, P.D. Adams, The Computational Crystallography Toolbox: crystallographic algorithms in a reusable software framework, J. Appl. Cryst. 35 (2002) 126–136.

[48] T. Hahn, International Tables for Crystallography, Volume A: Space Group Symmetry, Springer, 2002.

[49] T.R. Holton, T.R. Ioerger, J.A. Christopher, J.C. Sacchettini, TEXTAL: A pattern recognition system for interpreting electron density maps, in: Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB), pp. 130–137.

[50] IBM Research, Subject-oriented programming and the adapter pattern, IBM Research, 2008. www.research.ibm.com/sop/sopcadap.htm.

[51] International Organization for Standardization, ISO/IEC 14882:2003: Programming languages: C++, ISO, Geneva, Switzerland, 2nd edition, 2003. http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=38110.

[52] International Standards Organization, Technical Report on C++ performance, Technical Report N1487=03-0070, ISO/IEC JTC 1, Information Technology, Subcommittee SC 22, Programming Language C++, 2003.

[53] T.R. Ioerger, J.C. Sacchettini, TEXTAL system: Artificial intelligence techniques for automated protein model building, Methods in Enzymology 374 (2003) 244–270.

[54] J. Järvi, D. Gregor, J. Willcock, A. Lumsdaine, J. Siek, Algorithm specialization in generic programming: challenges of constrained generics in C++, in: PLDI ’06: Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation, New York, NY, USA, 2006, ACM Press, pp. 272–282.

[55] J. Järvi, M.A. Marcus, J.N. Smith, Library composition and adaptation using C++ concepts, in: GPCE ’07: Proceedings of the 6th International Conference on Generative Programming and Component Engineering, New York, NY, USA, 2007, ACM, pp. 73–82.

[56] J. Järvi, M.A. Marcus, J.N. Smith, Programming with C++ concepts, Science of Computer Programming (2009).

[57] J. Järvi, J. Willcock, A. Lumsdaine, Concept-controlled polymorphism, in: F. Pfennig, Y. Smaragdakis (Eds.), GPCE ’03: Proceedings of the 2nd International Conference on Generative Programming and Component Engineering, volume 2830 of LNCS, Erfurt, Germany, 2003, Springer Verlag, pp. 228–244.

[58] R.D. Jenks, R.S. Sutor, AXIOM: The Scientific Computation System, Springer-Verlag, 1992.

[59] M.P. Jones, Type classes with functional dependencies, in: ESOP ’00: Proceedings of the 9th European Symposium on Programming Languages and Systems, volume 1782 of Lecture Notes in Computer Science, New York, NY, 2000, Springer-Verlag, pp. 230–244.

[60] G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, W.G. Griswold, An overview of AspectJ, in: J.L. Knudsen (Ed.), Proceedings of the 15th European Conference on Object-Oriented Programming (ECOOP ’01), volume 2072 of Lecture Notes in Computer Science, Springer-Verlag, London, UK, 2001, pp. 327–353.

[61] R. Lämmel, K. Ostermann, Software extension and integration with type classes, in: GPCE ’06: Proceedings of the 5th International Conference on Generative Programming and Component Engineering, New York, NY, USA, 2006, ACM Press, pp. 161–170.

[62] K. Läufer, G. Baumgartner, V.F. Russo, Safe structural conformance for Java, The Computer Journal 43 (2000) 469–481.

[63] K. Läufer, M. Odersky, Polymorphic type inference and abstract data types, ACM Transactions on Programming Languages and Systems 16 (1994) 1411–1430.

[64] C.L. Lawson, R.J. Hanson, R.J. Kincaid, F.T. Krogh, Basic linear algebra subprograms for FORTRAN usage, ACM Transactions on Mathematical Software 5 (1979) 308–323.

[65] D. Lea, Public conversation, 2006. Workshop of Library-Centric Software Design at OOPSLA ’06, Portland, Oregon.

[66] R. Ley-Wild, U.A. Acar, M. Fluet, A cost semantics for self-adjusting computation, in: POPL ’09: Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, New York, NY, USA, 2009, ACM, pp. 186–199.

[67] V. Litvinov, Constraint-based polymorphism in Cecil: towards a practical and static type system, in: OOPSLA ’98: Proceedings of the 13th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, New York, NY, USA, 1998, ACM Press, pp. 388–411.

[68] U. von Luxburg, A tutorial on spectral clustering, 2006. Technical Report No. TR-149; http://www.kyb.mpg.de/publications/attachments/Luxburg06_TR_%5B0%5D.pdf.

[69] B. Magnusson, Code reuse considered harmful, Journal of Object-Oriented Pro- gramming 4 (1991) 8.

[70] M. Marcus, J. Järvi, S. Parent, Runtime polymorphic generic programming—mixing objects and concepts in ConceptC++, in: K. Davis, J. Striegnitz (Eds.), Multiparadigm Programming 2007: Proceedings of the MPOOL Workshop at ECOOP ’07, Berlin, Germany. homepages.fh-regensburg.de/~mpool/.

[71] M. Mattsson, J. Bosch, M.E. Fayad, Framework integration problems, causes, solutions, Commun. ACM 42 (1999) 80–87.

[72] D. McIlroy, Mass-produced software components, in: Proceedings of the 1st International Conference on Software Engineering sponsored by the NATO Science Committee, Garmisch Partenkirchen, Germany, Brussels, Belgium, 1969, Scientific Affairs Division, NATO, pp. 138–155.

[73] B. McNamara, Y. Smaragdakis, Static interfaces in C++, in: First Workshop on C++ Template Programming, Erfurt, Germany. oonumerics.org/tmpw00/.

[74] K. Mehlhorn, S. Näher, The LEDA Platform of Combinatorial and Geometric Computing, Cambridge University Press, 1999.

[75] N. Mitchell, G. Sevitsky, H. Srinivasan, The diary of a datum: An approach to modeling runtime complexity in framework-based applications, in: Proceedings of the First International Workshop of Library-Centric Software Design (LCSD ’05), an OOPSLA ’05 workshop, San Diego, CA, USA. Also as Technical Report 06-12 of Rensselaer Polytechnic Institute, Computer Science Department.

[76] M.B. Monagan, W.M. Neuenschwander, GRADIENT: algorithmic differentiation in Maple, in: ISSAC ’93: Proceedings of the 1993 International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 1993, ACM Press, pp. 68–76.

[77] N. Myers, A New and Useful Template Technique: “Traits”, volume 7, SIGS Publications, Inc., 1995.

[78] M. Odersky, Poor man’s type classes, Presentation at the meeting of IFIP WG 2.8, Functional Programming, 2006. lamp.epfl.ch/~odersky/talks/wg2.8-boston06.pdf.

[79] M. Odersky, The Scala language specification: Version 2.0, draft March 17, 2006. http://scala.epfl.ch/docu/files/ScalaReference.pdf.

[80] S. Peyton Jones, M. Jones, E. Meijer, Type classes: an exploration of the design space, in: Proceedings of the Second Haskell Workshop at ICFP ’97, New York, NY, USA, 1997, ACM Press. citeseer.ist.psu.edu/peytonjones97type.html.

[81] W.R. Pitt, M.A. Williams, M. Steven, B. Sweeney, A.J. Bleasby, D.S. Moss, The Bioinformatics Template Library–generic components for biocomputing, Bioinformatics 17 (2001) 729–737. http://bioinformatics.oupjournals.org/cgi/content/abstract/17/8/729.

[82] Princeton Satellite Systems, Matrixlib: BLAS and LAPACK, Matrix library API overview, 2008. http://www.psatellite.com/matrixlib/api/lapack.html.

[83] G. Rhodes, Crystallography Made Crystal Clear, Third Edition: A Guide for Users of Macromolecular Models, Academic Press, 2006.

[84] M. Rosendahl, Automatic complexity analysis, in: FPCA ’89: Proceedings of the Fourth International Conference on Functional Programming Languages and Computer Architecture, New York, NY, USA, 1989, ACM, pp. 144–156.

[85] P.S. Santas, Conditional categories and domains, in: Design and Implementation of Symbolic Computation Systems, volume 1128, 1996, Springer Berlin / Heidelberg, pp. 112–125.

[86] G.E. Schalnat, A. Dilger, G. Randers-Pehrson, C. Truta, S.P. Cadieux, E.S. Raymond, G. Vollant, T. Lange, W.V. Schaik, J. Bowler, K. Bracey, S. Bushell, M. Holmgren, G. Roelofs, T. Tanner, D. Martindale, P. Schmidt, T. Wegner, libpng home page, Website, 2009. http://www.libpng.org/pub/png/libpng.html.

[87] U. Shani, Filling regions in binary raster images: A graph-theoretic approach, SIGGRAPH Computer Graphics 14 (1980) 321–327.

[88] J. Shi, J. Malik, Normalized cuts and image segmentation, Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 888–905.

[89] J. Siek, L.Q. Lee, A. Lumsdaine, The Boost Graph Library: User Guide and Reference Manual, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.

[90] J. Siek, A. Lumsdaine, The Matrix Template Library: Generic components for high-performance scientific computing, Computing in Science and Engineering 1 (1999) 70–78. http://dx.doi.org/10.1109/5992.805137.

[91] J. Siek, A. Lumsdaine, Concept checking: Binding parametric polymorphism in C++, in: First Workshop on C++ Template Programming. oonumerics.org/tmpw00/.

[92] J. Siek, A. Lumsdaine, L.Q. Lee, Boost Graph Library, Boost, 2001. www.boost.org/libs/graph.

[93] Silicon Graphics, Inc., SGI Implementation of the Standard Template Library, 2004. http://www.sgi.com/tech/stl/.

[94] J. Smith, G. Dos Reis, J. Järvi, Algorithmic differentiation in Axiom, in: ISSAC ’07: Proceedings of the 2007 International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 2007, ACM, pp. 347–354.

[95] J. Smith, Y. Li, Y. Solodkyy, G. Dos Reis, J. Järvi, Local Specialization in Open-Axiom, Technical Report, Texas A&M University, 2009. http://parasol.cs.tamu.edu/groups/pttlgroup/local-specialization/.

[96] B. Speelpenning, Compiling fast partial derivatives of functions given by algorithms, PhD thesis, The University of Illinois at Urbana-Champaign, 1980.

[97] A. Stepanov, M. Lee, The Standard Template Library, Technical Report HPL- 94-34(R.1), Hewlett-Packard Laboratories, 1994. www.hpl.hp.com/techreports.

[98] J.E. Stoy, Denotational Semantics: The Scott–Strachey Approach to Programming Language Theory, The MIT Press, 1977.

[99] C. Strachey, Fundamental concepts in programming languages, Higher Order Symbol. Comput. 13 (2000) 11–49.

[100] B. Stroustrup, The C++ Programming Language (Third Edition and Special Edition), Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[101] B. Stroustrup, The C++0x “remove concepts” decision, Dr. Dobb’s Journal (2009).

[102] N. Su, Java 2 platform standard ed. 5.0 API, 2004. http://java.sun.com/j2se/1.5.0/docs/api/.

[103] C. Szyperski, Component Software: Beyond Object-Oriented Programming, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1998.

[104] M. Troyer, P. Dayal, R. Villiger, The Iterative Eigensolver Template Library, 2004. www.comp-phys.org/software/ietl/.

[105] T. Veldhuizen, Parsimony principles for software components and metalanguages, in: GPCE ’07: Proceedings of the 6th International Conference on Generative Programming and Component Engineering, New York, NY, USA, 2007, ACM, pp. 115–122.

[106] T. Veldhuizen, M. Jernigan, Will C++ be faster than FORTRAN?, in: Scientific Computing in Object-Oriented Parallel Environments (ISCOPE ’97), volume 1343 of Lecture Notes in Computer Science, Springer, 1997, pp. 49–56.

[107] T.L. Veldhuizen, Expression templates, C++ Report 7 (1995) 26–31. Reprinted in C++ Gems, ed. Stanley Lippman.

[108] T.L. Veldhuizen, Arrays in Blitz++, in: ISCOPE ’98: Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments, London, UK, 1998, Springer-Verlag, pp. 223–230.

[109] T.L. Veldhuizen, Active libraries and universal languages, PhD thesis, Indiana University Computer Science, 2004. http://osl.iu.edu/~tveldhui/papers/2004/dissertation.pdf.

[110] T.L. Veldhuizen, D. Gannon, Active Libraries: Rethinking the roles of compilers and libraries, Technical Report, University of Waterloo, 1998. http://ubiety.uwaterloo.ca/~tveldhui/papers/oo98.html.

[111] D. Villard, M.B. Monagan, ADrien: an implementation of automatic differentiation in Maple, in: ISSAC ’99: Proceedings of the 1999 International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 1999, ACM Press, pp. 221–228.

[112] P. Wadler, S. Blott, How to make ad-hoc polymorphism less ad-hoc, in: ACM Symposium on Principles of Programming Languages, 1989, ACM, pp. 60–76.

[113] J. Walter, M. Koch, Boost Basic Linear Algebra, C++ Boost, 2002. http://www.boost.org/doc/libs/1_38_0/libs/numeric/ublas/doc/index.htm.

[114] S.M. Watt, P.A. Broadbery, S.S. Dooley, P. Iglio, S.C. Morrison, J.M. Steinbach, R.S. Sutor, A first report on the A# compiler, in: ISSAC ’94: Proceedings of the International Symposium on Symbolic and Algebraic Computation, New York, NY, USA, 1994, ACM, pp. 25–31.

[115] S.M. Watt, P.A. Broadbery, S.S. Dooley, P. Iglio, S.C. Morrison, J.M. Steinbach, R.S. Sutor, Aldor User Guide, Aldor, 2000. http://www.aldor.org.

[116] R.C. Whaley, J. Dongarra, Automatically Tuned Linear Algebra Software, in: SuperComputing 1998: High Performance and Network Computing.

[117] J. Willcock, J. Järvi, A. Lumsdaine, Active libraries vs. separate compilation, 2004. Lecture slides.

VITA

Name: Jacob Nyffeler Smith

Address: Järvi-Labs, Dept. of Computer Science and Engineering, Texas A&M University, College Station, TX 77843-3112

Email Address: jacob.nyff[email protected]

Education: B.S., Mathematics, The University of Texas at Austin, 2001
           B.A., Plan II, The University of Texas at Austin, 2001

This document was typeset in LaTeX by Jacob Nyffeler Smith.