
Multi-Language Software Development

with

Marcel Toele Summer 2007

Master’s Thesis in Faculteit der Natuurwetenschappen, Wiskunde en Informatica
University of Amsterdam
Supervisor: prof. dr. P. Klint

Abstract

Software projects tend to grow into large quantities of program code, and most of this code is likely to be support code for secondary features. It is known that some programming languages are more concise than others (i.e. they require fewer lines of program code to implement a function point). Most notably, higher-level ‘‘dynamic’’ programming languages tend to be more concise than lower-level ‘‘static’’ programming languages, making them suitable for such support code. However, lower-level programming languages certainly have advantages of their own, most notably in terms of execution speed and tighter integration with the underlying operating system and hardware.

Many attempts exist to make software components written in different programming languages interoperate with each other. We divide these into two groups: distributed and non-distributed approaches. Our research has focused on the non-distributed approaches. To interface with an existing software component (a shared library in our case), a solution that uses such a non-distributed approach often requires a separately compiled shared library, which is the result of extensive programming, or the solution uses a specification. The programmatic solutions that we have come across, because of their programmatic nature, make it hard to establish the interface and bear the risk of becoming very hard to maintain. The solutions that use a specification are easier to use and maintain, but are often too simplistic or incomplete to effectively interface with any shared library, regardless of the way such a library is intended to be used.

In this thesis, we argue that it is beneficial to interface the higher-level ‘‘dynamic’’ programming languages to lower-level ‘‘static’’ programming languages, and we propose a way of interfacing these such that the interface is easy to establish, maintainable, efficient in use, and effective in all required circumstances.
One of the key benefits is that we attempt to interface the languages at run-time, without requiring a separately compiled ‘‘binding’’ library, making it a more natural interfacing approach for an interpreted, higher-level, ‘‘dynamic’’ programming language.


Table of Contents

1 Introduction 1

2 Why Multi-language Development? 5

2.1 Terminology ...... 5

2.1.1 Multi-language Interoperability ...... 5

2.1.2 Multilingual versus Polylingual Interoperability ...... 6

2.1.3 Separation of Concerns ...... 6

2.1.4 Information Hiding ...... 6

2.1.5 Encapsulation ...... 6

2.1.6 Holy Grail ...... 7

2.1.7 Seamlessness ...... 7

2.2 Scenario 1: Productivity ...... 8

2.2.1 Terminology ...... 8

2.2.2 Software Productivity ...... 8

2.2.3 Functional Size Measuring ...... 9

2.2.4 Influences on Software Productivity ...... 10

2.2.5 Software Productivity and Programming Languages ...... 10

2.2.6 80/20 Conjecture ...... 13

2.2.7 The Scenario ...... 14

2.3 Scenario 2: Multi-Disciplinary Software Development ...... 14

2.3.1 Higher-level versus Lower-level: Multi-Disciplinary Point of View ...... 15

2.3.2 Power to the People ...... 16


2.4 Scenario 3: Best of Both Worlds ...... 18

2.4.1 Language Features ...... 18

2.4.2 The Scenario ...... 20

3 Related Work 21

3.1 Distributed Versus Non-Distributed ...... 21

3.1.1 Distributed Approaches ...... 21

3.1.2 Non-Distributed Approaches ...... 22

3.2 A Closer Look ...... 24

3.2.1 Java ...... 24

3.2.2 Common Language Infrastructure ...... 25

3.3 Are Dynamic Languages the Future? ...... 27

4 Goals and Requirements 29

4.1 Field of Application ...... 29

4.2 Productivity Revisited ...... 30

4.3 Interface Creation and Maintainability ...... 31

4.3.1 Interface Creation ...... 31

4.3.2 Maintainability ...... 31

4.4 Efficiency in Use ...... 32

4.4.1 Increasing Seamlessness ...... 32

4.4.2 Dynamic Boundaries ...... 33

4.4.3 Portability ...... 33

4.5 Multi-Disciplinary Software Development Revisited ...... 34

4.6 Effectivity ...... 35

4.7 Known Issues in Multi-language development ...... 35

4.7.1 Lack of Documentation ...... 36

4.7.2 Performance ...... 36

4.7.3 Memory Management ...... 37

4.7.4 Threading ...... 39

5 Language Selection 41

5.1 Selection Criteria ...... 41

5.1.1 Productivity versus Execution Speed ...... 41

5.1.2 Feature Set ...... 42

5.1.3 Easy Language ...... 42

5.2 Ruby and C ...... 42

5.3 Consequences for the Goals and Requirements ...... 43

5.3.1 Shared Libraries ...... 44

5.3.2 Run-time ...... 44

5.4 Considered Alternatives ...... 45

5.4.1 Java & The Java Native Interface ...... 45

5.4.2 .Net and The Common Language Infrastructure ...... 47

5.5 Existing Work ...... 50

5.5.1 Ruby Inline ...... 50

5.5.2 Ruby Extension API ...... 51

5.5.3 Ruby/DL ...... 51

5.6 Main Thesis ...... 52

6 DLX 53

6.1 Field of Application ...... 53

6.2 Design Decisions ...... 54

6.2.1 Ruby-centred Development ...... 54

6.2.2 Run-time ...... 54

6.2.3 Specification not Programming ...... 54

6.3 The DLX Architecture ...... 57

6.3.1 Architectural Overview ...... 57

6.3.2 Library Locator/ ...... 58

6.3.3 Symbol Binder ...... 59

6.3.4 Function Caller ...... 61

6.3.5 Type Database ...... 63

6.3.6 Callback Caller ...... 64

6.4 Type Mapping: From C to Ruby and Back ...... 66

6.4.1 C Types ...... 66

6.4.2 Ruby Types ...... 67

6.4.3 Integer Types and Floating-point types ...... 68

6.4.4 Pointer Types ...... 69

6.4.5 Structure Types ...... 71

6.4.6 Union Types ...... 76

6.4.7 Array Types ...... 77

6.4.8 Typedef Names ...... 82

6.4.9 Enumerated Types ...... 83

6.4.10 Function Types ...... 83

6.4.11 The Void Type ...... 85

6.4.12 Type Casting ...... 85

6.5 Memory layout reconstruction ...... 86

6.5.1 Object size and alignment ...... 87

6.5.2 Bitfields ...... 91

6.5.3 Other uses of the algorithm ...... 91

6.6 Error Handling ...... 91

6.6.1 Trapped and Untrapped Errors in C and Ruby ...... 92

6.6.2 Trapped and Untrapped Errors in the DLX Language Interface ...... 93

6.6.3 Type Closures ...... 96

6.7 Penalties ...... 99

6.7.1 Execution Speed Penalties ...... 100

6.7.2 Memory Penalties ...... 100

6.7.3 Startup Penalties ...... 100

6.7.4 Possible Solutions ...... 100

6.8 Advanced Topics ...... 101

6.8.1 AutoDLX ...... 101

6.8.2 Memory Management ...... 103

6.8.3 Threading ...... 105

7 Experiments 107

7.1 Correctness Experiments: Lab Tests ...... 107

7.1.1 Memory Alignment Correctness ...... 107

7.1.2 Unit Test Suite ...... 109

7.2 Portability Experiments ...... 111

7.2.1 Platform Portability and Compatibility ...... 112

7.2.2 Portability: A Field Test ...... 112

7.3 Experiments on Effectiveness: Field Tests ...... 113

7.3.1 Effectivity Experiment I: ‘‘ODE TestBuggy’’ ...... 114

7.3.2 Effectivity Experiment II: ‘‘Inheritance in C’’ ...... 120

7.4 Dynamic Boundaries ...... 124

7.5 Performance Experiments ...... 126

7.5.1 Execution Speed Performance ...... 126

7.5.2 Memory Footprint and Startup Performance ...... 131

7.6 Memory Management Experiments ...... 132

7.6.1 Discussion ...... 135

8 Conclusion and Future Work 137

8.1 Conclusion ...... 137

8.2 Future Work ...... 143

Bibliography ...... 146

Chapter 1

Introduction

It is a trend that started in the 1970s and 1980s, but now, at the dawn of digital convergence, where almost every electrical appliance is equipped with a more than adequate processing unit, software is more important than ever.

As the processing power increases, so do the sizes of the software products that are running on these appliances, with ever growing feature lists to suit the specific preferences of the individual.

In a world where everything competes with everything else, the big challenge for the future of software development is how to produce these large, feature-rich software products effectively and efficiently.

This thesis addresses a specific problem domain that can be placed directly within this context, albeit at a level that is normally not visible to the end-user. In this thesis we address multi-language software development, seeking answers to questions such as:

• Why do we need multi-language software development?

• How may large, feature-rich, software products benefit from it?

• How can we make a multi-language interface easy to construct, maintainable and easy to use?

This thesis does not pretend to pose the ultimate solution to the above problems, but we think that our solution is a worthy contribution to the software development community.

Focus of This Thesis

This thesis focuses on the discussion of software development using general purpose, imperative, programming languages. It may be obvious that domain specific languages will yield higher productivity within their specific domains; however, they are less applicable for generic use. We will also not focus on functional programming languages, logic programming languages, etc.

How This Thesis Is Organized

This thesis has become quite an extensive work. In order to discuss all the material in a structured manner, we have organized this thesis as follows:

Chapter 1. Introduction

This chapter.

Chapter 2. Why Multi-Language Development?

In chapter 2 we start by introducing some terminology, to make it easier to communicate about the subject of multi-language software development. Then we try to answer the question: Why multi-language software development? We do this by outlining three scenarios in which we think that, under the right circumstances, it is beneficial to use multi-language software development.

Chapter 3. Related Work

To understand where our solution fits in with the rest of the multi-language software development solutions, we shall discuss some related work and provide a categorization for these solutions in chapter 3.

Chapter 4. Goals and Requirements

Then, in chapter 4, we shall establish a list of goals and requirements. As input to this list we will use the three scenarios from chapter 2 as well as information from existing literature.


Chapter 5. Language Selection

After we have established our list of goals and requirements, we do a language selection in chapter 5. First we pose selection criteria. These are influenced by the three scenarios from chapter 2 and by the goals and requirements from chapter 4. Then we do the actual language selection, explain our choice, and discuss several considered alternatives that were discarded, and for what reasons we discarded them. The chapter continues with a brief discussion of existing work. We end the chapter by formulating our main thesis, which is based on the results from the language selection and the goals and requirements that were posed in chapter 4.

Chapter 6. DLX

In chapter 6, we pose our solution. We do this in eight sections. First we pose the field of application for our solution, after which we explain our design decisions with respect to the goals and requirements set out in earlier chapters. Then we describe the architecture that lies at the heart of our solution. This is followed by a rather elaborate discussion of type mapping between the two languages, explaining how we do the mapping and why it has proven to be such a challenge at times. Then we discuss the memory layout reconstruction algorithm that makes our solution work at run-time, which is followed by a section on error handling. We conclude the chapter with sections on penalties and on advanced topics, such as automating the multi-language interface construction (AutoDLX), memory management and threading.

Chapter 7. Experiments

To assess the results of our work when it is used, we conduct a series of experiments in chapter 7. The experiments range from laboratory experiments for correctness to case studies that are conducted out in the field with real-world examples.

Chapter 8. Conclusion and Future Work

We conclude this thesis in chapter 8, where we discuss our conclusion and present starting points for future work.

Chapter 2

Why Multi-language Development?

In this chapter we will outline three scenarios in which, we think, it would be beneficial to use some form of multi-language software development.

In each of the three scenarios a potential benefit will be highlighted. The three benefits that we will focus on in the scenarios are:

1. A potential increase in software development productivity.

2. A potential benefit from multi-disciplinary software development.

3. An increase in effectiveness of any single programming language.

2.1 Terminology

But before we start, what actually is multi-language software development? Let us first introduce some terminology.

2.1.1 Multi-language Interoperability

The problem of cooperation among software modules of different (programming) languages is called the multi-language interoperability problem (Barrett et al. [20]). Interoperability problems can also occur in systems of a single programming language, where differences in dialects or hardware platforms cause interoperability issues. Such problems are beyond the scope of this thesis.


2.1.2 Multilingual versus Polylingual Interoperability

In [20] Barrett, Kaplan and Wileden define two distinct forms of multi-language interoperability: multilingual interoperability, ‘‘in which a component that has been written in one programming language needs to access components (subprograms, data objects, etc.) written in a single other programming language’’, versus polylingual interoperability, ‘‘in which a component written in one programming language needs to uniformly interact with a set of components written in two or more different languages (which may or may not include the language that was used for the first component)’’.

We underline the word uniformly here, because the key distinction is that two or more components which are similar in functionality (for example two personnel records, implemented in two different languages) need to be accessed in a uniform way from a single (other) language.

2.1.3 Separation of Concerns

For the definition of the separation of concerns, we think [16] gives a concise and workable definition:

‘‘In computer science, separation of concerns (SoC) is the process of breaking a program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program. Typically, concerns are synonymous with features or behaviors.’’ ‘‘Progress towards SoC is traditionally achieved through modularity and encapsulation, with the help of information hiding.’’

2.1.4 Information Hiding

In [12], we have found an agreeable definition for information hiding:

‘‘[Information Hiding is a principle for] ... the hiding of design decisions in a computer program that are most likely to change, thus protecting other parts of the program from change if the design decision is changed. Protecting a design decision involves providing a stable interface which shields the remainder of the program from the implementation (the details that are most likely to change).’’

In addition to this, we think that information hiding, in the particular case of multi-language software development, can also be applied to hide the fact that a different programming language is used for a particular software component or module.

2.1.5 Encapsulation

In computer science, encapsulation is a language feature for bundling related data objects and/or functions within a single language construct. For instance, records1 in some programming languages allow the bundling of related data objects (attributes) under the umbrella of a new type.

1In C these are called structs.

In object oriented programming languages, classes and objects provide this umbrella to encapsulate both attributes and the operations on these attributes (usually called functions, member functions, or methods). In procedural programming languages, procedures allow one to encapsulate individual instructions.
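To make the umbrella metaphor concrete, here is a minimal Ruby sketch; the Point type and its norm method are our own illustrative choices, not taken from the thesis:

```ruby
# A sketch of encapsulation in Ruby: Struct bundles related
# attributes (like a C struct) and can also carry the operations
# on those attributes (like a class).
Point = Struct.new(:x, :y) do
  def norm
    Math.sqrt(x * x + y * y)
  end
end

pt = Point.new(3.0, 4.0)
puts pt.norm   # prints 5.0
```

Both the data (x, y) and the operation on it (norm) live under a single language construct, which is exactly what encapsulation provides.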

2.1.6 Holy Grail

In some religions, the holy grail refers to the chalice that Jesus Christ supposedly used during his last supper, granting eternal life to whoever drinks holy water from it, or so the story goes... no proof of its existence nor of its alleged super power was ever found or presented.

In a non-religious context, the holy grail refers to the greatest good or the ultimate goal that one attempts to reach, but that usually cannot actually be reached due to practical or theoretical constraints.

In multi-language software development, there also is such a holy grail: Seamlessness.

2.1.7 Seamlessness

Interoperability is called seamless if developers using a software module do not need to be aware of its implementation details (such as the programming language used, etc.) [20]. Seamlessness is the degree of seamless interoperability that has been achieved.

Seamlessness depends on several factors:

• Encapsulation - The ways in which different programming languages are able to bundle related data objects and/or instructions has to be mapped sensibly.

• Information hiding - For instance hiding certain implementation details that are explicit in one language but transparent in another (e.g. use of pointers is explicit in C++ but it is transparent in Java).

• Intelligent type system mapping - Types of data objects in one component written in one language have to be sensibly mapped onto corresponding types of another programming language (e.g. C has several integer types each with an explicit maximum capacity; in Ruby there is only a single integer type2 that has an arbitrary maximum capacity).

Possibly more factors exist, but these are not identified in this thesis.

In multi-language interoperability, absolute seamlessness (i.e. 100% seamless interoperability) is, in general, not achievable, especially not when languages are being used with very distinctive feature sets.

2This type may have several implementations, but this is kept transparent to the user.
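The integer-mapping factor above can be made concrete with a few lines of Ruby. This is only a sketch: the pack directive "C" (an 8-bit unsigned value) merely stands in for whatever fixed-width C type a language interface would have to target:

```ruby
# Ruby has a single, arbitrary-precision integer type:
big = 2**100 + 42
# Arithmetic on the Ruby side never overflows.

# A language interface, however, must squeeze such values into
# fixed-width C types. Packing into an 8-bit unsigned slot ("C")
# silently keeps only the low 8 bits -- exactly the kind of
# mismatch an intelligent type mapping has to detect or handle:
truncated = [2**8 + 42].pack("C").unpack("C").first
puts truncated   # prints 42 (298 mod 256)
```

An interface that maps types naively would let such truncation pass unnoticed, which is why sensible type mapping is listed as a prerequisite for seamlessness.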

2.2 Scenario 1: Productivity

Here, we outline our first scenario, but before we do, we need to introduce some additional terminology that applies only to this scenario.

2.2.1 Terminology

Within the context of software engineering, a measure provides a quantitative indication of the extent, amount, dimensions, capacity or size of some attribute of a product or process (Pressman [57]), e.g. the number of errors.

There are direct and indirect measures. Examples of direct measures are lines of code (LOC), execution speed and memory size. Indirect measures include functionality, quality, complexity, efficiency, reliability, maintainability (and many other -abilities). A direct measure is also said to have a natural unit.

A measurement is the act of taking a measure. Furthermore, a metric relates one or more data points resulting from a measurement in some way (e.g. the number of errors per software module, or per line of code).

Finally, an indicator is a metric or combination of metrics that provides insight into the software process, project or product itself [58] (for example, 0.01 errors per line of code may be good, whereas 0.5 errors per line of code may indicate a problem).

2.2.2 Software Productivity

In economic terms, productivity is the ratio between the amount of goods or services produced and the labor or expense that goes into producing them (Jones [49]). Intuitively, when extended to software engineering, this leads to the assumption that software productivity is the ratio of the amount of software produced to the labor or expense of producing it. In theory this sounds pretty solid; in practice it appears to be fuel for much debate.

The problem is that productivity in general represents a metric. For example, 40 hours of labour produces 120 items, which means three items are produced during one hour of labour. This implies that both input and output can be quantified. But how does one quantify the output of software production? Perhaps it is better to first ask ourselves: what is software?

While there are (arguably) better definitions for the term software3, within the context of Software Productivity software is often seen as one or more computer programs, each consisting of lines of source code4.

Again, intuitively, it is tempting to think that lines of source code would be the measure to use when quantifying software production. However, measures of programming productivity and quality in terms of lines of code, and of the cost of detecting and removing code defects, are inherently paradoxical (Jones [48]). The paradox lies in the fact that lines of code tend to emphasize longer, rather than efficient or higher-quality, programs. Also, higher-level programming languages are penalized, when compared to lower-level programming languages such as assembly languages, when lines of code per unit of effort are used. This is because higher-level languages allow for shorter programs to be written.

3One interesting definition of Software is that of Bergstra [24]: ‘‘[Software is] whatever can be transported via a glasfiber of at least one meter length except physical forces and pure optical energy.’’
4In the light of this, perhaps it’s better to use the term Programming Productivity rather than Software Productivity.

Furthermore, Jones [49] states, lines of code, of themselves, are not the primary deliverables of a software project, and customers often do not know how many lines of code there are in the software they are buying. A consumer buys or uses software based on what it can do, not on how it was programmed or developed. This means that in software development the (economic) value of the produced goods is not measured in the same units as the units of development. So a better input/output unit needs to be defined: one that reflects the utility value of software based on the function that [the] software is intended to perform (Jones [49]).

If we use the functional value (an indirect measure) of the software in preference over the lines of (source) code (a direct measure), we can give a better definition of the term Software Productivity:

Software Productivity is the ratio between the functional value of software produced to the labor and expense of producing it (Mills [53]).

2.2.3 Functional Size Measuring

In 1979, Allan Albrecht published a paper [19] which laid the foundation of what was to become known as the subject of Functional Size Measuring [33]. In this paper, Albrecht describes a method for measuring the size of software in terms of functionality, which he called Function Point Analysis (FPA).

Over the past few decades this has evolved into the ‘IFPUG’ method [43], which is being managed by the International Function Point Users Group. Besides this method for functional size measuring, several other variations have come into existence. Each of these extends the field of application for function point analysis, which was originally limited to medium to large scale, administrative, information systems, and business software.

Counting in FSM

All functional size measuring methods have in common that they are indirect measures. The functional size is typically the result of a weighted sum of a set of measurement parameters5 that can be counted directly [57].

This sum can then be post-processed by applying a series of complexity adjustment values. This is done by assigning a set of questions6 a complexity value on a scale from 0 to 5.

5Examples are: The number of user inputs, user outputs, user inquiries, files, external interfaces, etc.
6These are questions such as: ‘‘Is performance critical?’’

Computing function points (FP) is then done using the following relation:

FP = count-total × [0.65 + 0.01 × Σi Fi]

where: count-total is the result of the weighted sum discussed above, Σi denotes the sum over the complexity adjustment questions, and Fi is the complexity adjustment value assigned to question i.

With the advent of newer, and more refined, methods for functional size measuring, variations apply.

Once function points have been measured, they can be used to determine software productivity, quality, and so on. Examples are: errors per FP, defects per FP, cost (i.e. $ or €) per FP, and so on.
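As a sketch of how this relation is applied, the snippet below computes FP for an invented set of counts, weights and adjustment scores; real FPA uses calibrated weight tables and fourteen standardized questions:

```ruby
# Hypothetical function point computation following
# FP = count-total * [0.65 + 0.01 * sum(Fi)].
# All counts, weights and adjustment values below are invented
# for illustration only.

counts  = { inputs: 12, outputs: 8, inquiries: 5, files: 4, interfaces: 2 }
weights = { inputs: 4,  outputs: 5, inquiries: 4, files: 10, interfaces: 7 }

# The weighted sum over the directly countable measurement parameters:
count_total = counts.sum { |param, n| n * weights[param] }
puts count_total   # prints 162

# Fourteen complexity adjustment questions, each scored 0..5
# (here simplified to a uniform score of 3):
fi = [3] * 14

fp = count_total * (0.65 + 0.01 * fi.sum)   # => roughly 173.3
```

With fp in hand, the indicators mentioned above follow directly, e.g. cost per FP is simply total_cost / fp.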

2.2.4 Influences on Software Productivity

Software productivity depends on many factors. In [21], Victor Basili and Marvin Zelkowitz define five important factors that influence software productivity:

1. People factors - The size and expertise of the development team.

2. Problem factors - The complexity of the problem at hand, the number of requirements.

3. Process factors - Analysis and design techniques that are used, programming languages and CASE tools available, and review techniques used.

4. Product factors - Reliability and performance of the computer-based system.

5. Resource factors - Hardware and software resources.

In fact, Barry Boehm, an established scientist in the field of software estimation and productivity, in an interview states that: ‘‘The people factor revolves around capability, experience, collaboration, and retention. You’ll usually have a productivity factor difference of over 10 if you’re doing the same job with qualified as opposed to unqualified people.’’[51].

Although, as we can see, there are many factors that influence software productivity, in this thesis we focus on one particular factor, the process factor, and even more specifically: the choice of programming languages used.

2.2.5 Software Productivity and Programming Languages

So far we have discussed two ways of sizing software: lines of code and functional size measurement (expressed in (full) function points). When both ways of sizing software are related, some interesting phenomena can be seen. The relationship between the number of lines of code and the number of function points depends on the quality of the design and on the programming language used (Pressman [57]). In [18], Albrecht and Gaffney publish quantified results for this, relating the functional value of software to source lines of code and development effort prediction.

Programming language                                    LOC/FP (average estimate)
assembly language                                       320
C                                                       128
Cobol                                                   105
Fortran                                                 105
Pascal                                                  90
Ada                                                     70
php                                                     67
Java                                                    31
object-oriented languages                               30
fourth generation languages (4GLs)                      20
code generators                                         15
spreadsheets (excel programming)                        6
graphical languages (icons) (draw-a-program languages)  4

Figure 2.1: Relationship between programming language, source lines of code and functional value. Source: [57], [63]. The results for Java and php are preliminary and may suffer from sparse data [63].
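A quick way to read figure 2.1 is to invert it: for a program of a given functional size, the expected code volume differs by an order of magnitude between languages. The sketch below uses an invented size of 100 function points and three rows from the table:

```ruby
# Rough code-size estimates derived from the LOC/FP averages in
# figure 2.1, for a hypothetical application of 100 function points.
loc_per_fp = { "assembly language" => 320, "C" => 128, "Java" => 31 }
fp = 100

estimates = loc_per_fp.transform_values { |ratio| ratio * fp }
estimates.each { |lang, loc| puts "#{lang}: ~#{loc} LOC" }
# assembly language: ~32000 LOC, C: ~12800 LOC, Java: ~3100 LOC
```

The same functionality that takes roughly 32,000 lines of assembly is estimated at about 3,100 lines of Java, which is the productivity argument of this scenario in a nutshell.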

Higher-level versus Lower-level Programming Languages

Of course, intuitively, this is nothing new. Our intuition already tells us that it is probably more productive to write a certain program in C than it is to write that same program in an assembly language.

Likewise, according to figure 2.1, similar things can be said about Ada versus C, or Java versus Pascal.

In such pair-wise programming language comparisons there is always a higher-level programming language compared to a lower-level programming language. A higher-level programming language hides more details7 of the underlying computer system from the programmer than a lower-level programming language.

The relative scale we use here is intentional; in a Java versus C comparison, Java is clearly the higher-level programming language and C the lower-level programming language. But in a comparison between C and an assembly language, it is evident that C can be seen as the higher-level of the two.

It is also important to note that, while the programming language generations play a role in the higher-level versus lower-level distinction, it is also possible to distinguish higher-level from lower-level languages within the same language generation.

7These are details such as: architecture, instruction set, memory layout, memory management, etc.

Ruby:

    def concat( s1, s2 )
      return( s1 + s2 );
    end

C (callee reserves memory):

    char* concat1( char* s1, char* s2 )
    {
      int destlen = strlen(s1)+strlen(s2);
      char* res = (char*) malloc(destlen+1);

      strcpy( res, s1 );
      strcpy( res+strlen(s1), s2 );
      res[destlen] = '\0';

      return( res );
    }

C (caller reserves memory):

    void concat2( char* s1, char* s2, char* dest )
    {
      int destlen = strlen(s1)+strlen(s2);

      strcpy( dest, s1 );
      strcpy( dest+strlen(s1), s2 );
      dest[destlen] = '\0';
    }

Figure 2.2: String concatenation in two 3GL languages.

Example Comparison

To illustrate the difference between such a higher-level language and a lower-level language, we shall give a small example, showing how two different third generation languages (3GL), C and Ruby, are typically used to implement a common problem: string concatenation (see figure 2.2).

We actually give two solutions for the C case, which we will explain in a bit. What at least becomes clear from the example is that something which is regarded as trivial from the perspective of functional value becomes pretty non-trivial in a lower-level language such as C, while it remains reasonably trivial in the higher-level language, Ruby.

What is happening in C is that, since memory is managed explicitly, one must decide up front who is going to take care of reserving memory for the concatenated version of the string. In the top-most case, memory is reserved by the function itself, but this delegates the responsibility for releasing that memory, at some later stage of execution, to the party that calls the function. (This, of course, causes the additional burden of communicating this behaviour through API documentation.)

The bottom version of the C implementation requires the calling party to reserve memory for the concatenation result up front. This requires the calling party to know of details which it should not have been bothered with in the first place.

Needless to say, it is details such as these that make it less productive to program in a lower-level programming language, such as C, than in a higher-level language, such as Ruby.

Admittedly, this is a very crude illustration, but the differences become clearer when both programming languages are used, side by side, for an extended period of time.

2.2.6 80/20 Conjecture

In the early days of C, developers welcomed the productivity and portability gain it provided over lower-level languages, such as assembly languages. But, since it was a higher-level language, C provided less execution performance than did an assembly implementation. The result was that, in those early days, programmers often had to resort to implementing small portions of their application in assembly, trading in some of the productivity and portability gain for a little more computational performance.

Moore’s Law

In 1965, Gordon E. Moore made the empirical observation that the number of transistors on an integrated circuit, for minimal component cost, doubles every two years [54]. In reality, the number has doubled every 18 months or so for the past four decades.

As time progressed, there was less and less need for doing things in assembly, in part because compilers improved, and in part because the processor power increased.

But time progressed even further, giving computers much more execution and memory performance than a few decades ago. Even though the theoretical limits are coming into view, Moore’s law seems to hold for at least several chip generations to come.

The nice thing about Moore’s law is that it works both ways. Either for the same money you will get roughly double the processor power, or you will get the same processor power for roughly half the price.

Increase in Software Productivity?

But what about the increase in software productivity?

The perception that software productivity increases at rates of only 4-6% is usually derived from inappropriate measures, such as counting lines of code[60]. In fact, Boehm and Basili [27] note that productivity has been stuck at about 10 lines of code per person per day for years. This is not so surprising, since it is a measure of what a human can produce. However, as Mary Shaw[60] puts it very nicely:

‘‘It hides the very real progress in software abstractions, which contribute by providing more powerful design elements for those lines of code to present.’’

So, on the one hand we have hardware performance, which, following Moore’s law, roughly doubles every two years, and on the other we have software productivity, the increase of which appears to lag behind.

We think that one way to look at Mary Shaw’s words is to say that, at present, a line of source code means more, because it is written in a newer programming language, with a higher level of abstraction, but, on the down side, also with less execution performance. However, in order for software productivity to catch up a little, we think there is room to trade in some processor power for a gain in productivity.

In some ways, this is no different from what happened in the early days of C. Back then, one could say that:

Conjecture: Most source code of a software product (~80%) is actually non-computationally intensive (support) code that runs in only a small portion of the execution time of a program (~20%), while a relatively small percentage of the source code (~20%), because of computationally intensive algorithms, consumes a large part of the execution time of a program (~80%).

Over the past few decades, as time progressed and programming languages became better, more concise, and more productive (i.e. higher-level), why can we not say the same right now?

We think it is about time for history to repeat itself, this time with C as the lower-level language and a modern higher-level language as its counterpart; the game stays the same, only the players change.

2.2.7 The Scenario

As we can see, there are many diverse factors that influence software productivity. Many of these are well beyond the scope of this chapter (and also of this thesis).

However, if we keep other factors that influence software productivity (as displayed in section 2.2.4) constant, then a development team, equally skilled in two different programming languages, one language being considerably more productive than the other8, should be able to be more productive if multi-language software development is used. The productivity gain would be achieved by shifting implementation of the non-computationally intensive (support) code (~80%) to the more concise and productive, higher-level, programming language, while leaving only the computationally intensive algorithms to be implemented in the faster executing, but more elaborate, lower-level programming language.

2.3 Scenario 2: Multi-Disciplinary Software Development

As software products get larger and more feature rich, so do the development teams that have to create the software. Of course, modularizing the design, and dividing it into components, helps in making larger software products with ever-growing development teams.

But this way of dividing the workload across a homogeneous group of development teams with similar skills and experience is not what we wish to address here.

8Consider, for example, table 2.1.

By multi-disciplinary software development we mean a way of developing software with a heterogeneous group of development teams (or individual contributors). Each team has different skills and experience, and each team contributes its own specific expertise to the creation of the final product.

2.3.1 Higher-level versus Lower-level: Multi-Disciplinary Point of View

One way of looking at these heterogeneous development teams is by looking at differences in expertise or skills, and in particular: differences in skills in higher-level and lower-level programming languages.

In the previous section we outlined a scenario in which the same development team, skilled in both the lower-level and the higher-level language, would develop in both languages. In the development of larger software products, it is more common to divide the workload between different development teams. One team specializes in the features that need to be implemented in the lower-level programming language, and another team specializes in the features that need to be implemented in the higher-level programming language.

This has several benefits, the most important of which is that development in these lower-level languages is expensive, so it pays off to use good, experienced developers9. Unfortunately, good developers in these lower-level languages are hard to find. This makes them –ironically enough– expensive per hour. So either way, you want to keep these development teams small.

Fortunately, we already provided some argumentation for writing the bulk of the source code, which is assumed to be non-computationally intensive, in a higher-level language. Because these programming languages are often easier to master, personnel is easier to find, so these development teams can be bigger, or there can be more of them.

Video Game Industry

One example where multi-disciplinary software development takes place, today, is in the video game industry. As gaming consoles have become more powerful with each generation, the games have become bigger and bigger too.

The times that a computer game could be created by a single small development team are long gone. Modern games are created by tens, if not hundreds, of individuals, each with a different background, ranging from core programming, via game logic and A.I., to art, (3D) modelling, and design (Phelps and Parks [55]).

All these individuals (divided into teams) contribute to the final product. Even within the programming teams several subcategories can be distinguished.

Because of the increase in processing power, not all parts of a commercial game have to be written in C or C++ (or even optimized assembly) anymore. More and more parts of a video game can now be written in higher-level languages, by people who have different concerns than those involved in making the game engine, the physics simulation, or the surround sound run smoothly, as these are implemented in C or C++ (or still in assembly).

9Recall our quote from Boehm about influences on software productivity from section 2.2.4.

These developers (and development teams) focus on things such as game logic, A.I., and game plots. They develop in high-level scripting languages, because it helps them to focus on the task at hand, or because it allows them to exploit the dynamic possibilities of such languages to improve game A.I. ([39, 55, 61]).

However, these different teams need to cooperate, because, even though their concerns may be distinct, the things that they work on may be closely related. (After all, they are working on the same product, a video game.)

As a small example, consider for instance the following: A 3D and physics engine developer is working on the game engine. He ‘‘sees’’ a 3D character (an object) that is walking in a virtual environment. He also ‘‘sees’’ that another object is in its vicinity. While he observes, he is perhaps focusing on how smooth, or rather how choppy, the simulation is.

What this engine developer does not know, however, is that this character is, in fact, ‘‘Thorgrim the Grudgebearer’’, who automatically must engage his ‘‘Orc enemy, Urgat,’’ whenever he is near. This is the concern of the game logic and A.I. developer, who works in the high-level scripting language.

So, on one side there is the game engine developer, who creates all these seemingly interesting possibilities: smooth skeletal animation, realistic physics simulation, objects that can attach themselves to other objects, and so on.

And it is the higher-level game logic developer who gives meaning to these possibilities10: e.g. ‘‘Thorgrim, sitting on his throne, as it is carried by four men’’.

Therefore, there has to be a clear and powerful multi-language interface that connects the lower-level 3D and physics engine to the high-level game logic and A.I. scripting environment.

2.3.2 Power to the People

Another way of looking at multi-disciplinary software development –and one that excites us most– is by looking at a certain phenomenon that is becoming more and more prevalent these days:

Over the past few decades, as a result of the abundance of ever more powerful computerized equipment, our society has undergone a great technical development. Because of this, and because of the availability of enabling appliances and computer (web) applications, the role of the consumer is gradually changing into that of a producer as well.

These developments, i.e. the prosumer11, and related phenomena, such as mass customization, consumer integration and open innovation[65, 56], were predicted almost 35 years ago.

10For the sake of simplicity, we ignore other skills, such as 3D modelling, art and design here.
11In our case a portmanteau of producer and consumer[15].

The Wikipedia article on the prosumer[15] gives such a clear and concise description of the origins that we think it is best to quote the following two paragraphs:

In 1972, Marshall McLuhan and Barrington Nevitt suggested in their book Take Today, (p. 4) that with electric technology, the consumer would become a producer. In the 1980 book, The Third Wave, futurologist Alvin Toffler coined the term "prosumer" when he predicted that the role of producers and consumers would begin to blur and merge (even though he described it in his book Future Shock from 1970).

Toffler envisioned a highly saturated marketplace as mass production of standardized products began to satisfy basic consumer demands. To continue growing profit, businesses would initiate a process of mass customization, that is the mass production of highly customized products. However, to reach a high degree of customization, consumers would have to take part in the production process especially in specifying design requirements. In a sense, this is merely an extension or broadening of the kind of relationship that many affluent clients have had with professionals like architects for many decades.

The results of prosumption are, today, omnipresent in places like the internet, ranging from mere community-enabling message boards (where, for example, the inexperienced newcomer is helped by the experienced fellow user, until the newcomer himself becomes an experienced user, who can then help out yet other inexperienced newcomers), to blogging sites, and so on.

With the advent of community-driven development, and the growing acceptance of open-source and free software, the possibilities for consumers to actually contribute to the development of the products that they use are also coming into reach. This leads to new ways of developing products, by means of open innovation, where everybody is allowed to participate and contribute. And, at present, even some of the traditionally commercial parties are opening their eyes and joining in. Consider, for example, the transition of the Java platform license to the GPL.

Mozilla Firefox

The transition of Java to the GPL is an example concerning a process tool, a product development tool, but this extends to end-user products as well. Consider, for example, the web browser Mozilla Firefox[35], which is very popular in the open-source community.

This software has been made possible by user contributions from all over the world. Of course, the core engine of Firefox is mainly written in C/C++, which leads to contributors, and a development team, with a very specific type of skills.

However, –and this is where things get particularly interesting from a multi-language software development point of view– a large portion of the functionality (although not core functionality) of the web browser, is developed in a higher-level, and easy to master, scripting language.

This has allowed developers who are not skilled in the lower-level language to contribute to the final product. It has resulted in a nice blend of contributions by developers with different views on the final product. In this way, the browser’s graphical user interface can be developed by people who have experience in usability and user interfaces.

Also very interesting are the nearly 2,000 add-ons and extensions[4] that have become available through the aid of contributing developers who are not even part of the core browser’s development team. Most of these contributions are from spare-time amateurs and semi-professionals who saw an opportunity to add a feature, a functionality, that allowed them to make the browser better fit their needs. Once an extension is written, it can be shared with the rest of the community, so that others may benefit from it, or even adapt the extension to their own liking, as they too gradually evolve from being just a consumer and enter the realm of prosumption.

It is functionalities such as these, that is, functionalities that the original (core) developers had not anticipated, because they, for instance, did not foresee the application of their software for a certain particular purpose, that make this sort of software development very interesting.

And, of course, –let us not forget that this is a thesis about multi-language software development– where there is software development in different languages, there is a need for a multi-language interface that connects these languages.

2.4 Scenario 3: Best of Both Worlds

So far, we have given two theoretical arguments for multi-language development: Productivity and multi-disciplinary software development. For those who are not convinced yet, we conclude by outlining a scenario that simply focuses on the practical benefit of multi-language development: Combining the best language features of different programming languages to get the best of both worlds.

2.4.1 Language Features

A programming language is characterized by a set of language features. These features influence how the programming language is used, which contexts it is best used in, and for what purpose. Using a programming language that is not well suited for a particular purpose may prove to be counterproductive.

As outlined in the introduction we only focus on general purpose imperative programming languages, but even within this specific group of programming languages there is a collection of language features differentiating one programming language from the other.

To illustrate this, let us give a few examples of such features:

Compiled versus Interpreted - Software development in one programming language (e.g. C, C++, or Java), or execution environment (native hardware, Java Virtual Machine), may require the program to be compiled before it can be executed. In other programming environments, or execution environments, such as Perl, Ruby, Python, or other scripting languages, programs can be used directly, because their source code is interpreted directly.

This distinction makes such languages fit for very different purposes. Because scripting languages do not need to be compiled, no compiler is needed, and the omitted compilation step makes for a very swift development cycle. On the other hand, because of the interpretation, the resulting end product will have much lower execution performance.

Interactive Code Experimenting and Testing - Some programming environments, because of the interpreted nature of the programming language or execution environment, provide an interactive prompt ([32, 37, 47] and others). This prompt can be used to interactively test portions of the source code for correctness. In our experience with these interactive development tools, this leads to an impressive productivity boost, because the feedback loop is even shorter than that of traditional interpreted programming languages (see above).

Memory Management - Programming languages can also have differences in memory management strategies. Some languages require explicit manual memory management, while others can be equipped with garbage collection algorithms to perform automatic memory management.

Execution Errors - Different programming environments also differ in the way they handle execution errors. Cardelli [30] argues that it is useful to distinguish between two kinds of execution errors: execution errors that cause computation to halt immediately and execution errors that go unnoticed –for a while– and later cause arbitrary behaviour. They are called trapped and untrapped errors respectively. This feature influences speed of debugging, which in turn is likely to influence speed of development.
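Cardelli’s distinction can be illustrated with a small C sketch (the parsing functions below are hypothetical examples of ours, not taken from [30]):

```c
#include <stdlib.h>
#include <assert.h>

/* Trapped: a bad input halts the computation immediately,
   at the exact point of the error. */
int parse_port_trapped( const char *s )
{
    int port = atoi( s );
    assert( port > 0 && port < 65536 );  /* halts here on bad input */
    return port;
}

/* Untrapped: atoi() silently turns "not-a-number" into 0, which
   then flows onward through the program and may cause arbitrary
   behaviour much later (e.g. an inexplicable attempt to connect
   to port 0), far from the actual source of the error. */
int parse_port_untrapped( const char *s )
{
    return atoi( s );
}
```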

Operating System Interaction - Some programming languages allow for a very tight integration with the underlying operating system, while others run on a virtual machine, making the distance to the underlying operating system intentionally large.

Semantic Gap - Some programming languages have a larger semantic gap between them and the underlying physical representation (e.g. the computer hardware) than others. For instance, there is a small semantic gap between C and the computer hardware[45], while there is a larger semantic gap between Java and the underlying computer hardware (not least because of its virtual machine).

Portability - The size of the semantic gap also influences the portability of the resulting source code. In this regard, assembly languages are less portable than C source code which, in turn, can be less portable than Java source code, and so on.

Native versus VM - Within the group of compiled languages, there is yet another distinction: that of natively compiled versus virtual machine (VM) compiled. Programs that are compiled into native code are transformed into machine code instructions, to be executed by the operating system and physical hardware. Programs that are compiled into virtual machine code (sometimes called bytecode [50]) are executed by a virtual machine, thereby circumventing the host operating system and hardware. This increases the semantic gap and the portability of programs written in languages that are compiled to virtual machine code, but it usually also means that interaction with the underlying operating system is decreased.

2.4.2 The Scenario

So far, we have seen a small selection of possible language features, and that these language features may influence the fitness for a certain purpose.

By doing multi-language development using two or more programming languages with distinct feature sets, we can achieve an increase of the effectiveness over any single such programming language. The language interface then allows access to certain language features that are not available in the current programming language.

For example, if software development mainly takes place in Python, and access to explicit implementation details, such as pointers, explicit memory layout, or memory management, is needed, then a fallback to a lower-level language that provides such language features (e.g. C) may be required.
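As a small illustration of what such explicit implementation details look like on the C side (a sketch of our own; the struct is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

/* C gives direct control over the memory layout of data: this struct
   maps byte-for-byte onto, say, a binary file header or a block of
   hardware registers -- something a high-level language offers no
   direct equivalent of. */
struct header
{
    uint32_t magic;    /* bytes 0..3 */
    uint16_t version;  /* bytes 4..5 */
    uint16_t flags;    /* bytes 6..7 */
};

/* Pointers allow the same raw bytes to be reinterpreted in place,
   without any copying or conversion. */
uint32_t read_magic( const void *raw )
{
    return ((const struct header *)raw)->magic;
}
```

A Python programmer would typically reach for a C fallback (or stdlib modules such as `struct` or `ctypes`) precisely when this kind of byte-level control is needed.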

This too validates the use of multi-language development.

Chapter 3

Related Work

The multi-language interoperability problem is not a new problem. As a result, there are many, many existing solutions. It would make this thesis at least twice as large if we were to go into the details of each and every one of them. For this reason, we have chosen to simply categorize the existing solutions, to make it clearer where our solution, which is presented in later chapters, fits in.

Where applicable, we will go into more detail. For instance, if the discussed solution has more similarities to our solution, it is useful to go into a little more detail.

3.1 Distributed Versus Non-Distributed

While there are other classifications possible[20], we think it is useful to categorize approaches to the multi-language interoperability problem into two distinct groups:

1. Distributed approaches
2. Non-distributed approaches

What we mean by this will become clear in the next few sections.

3.1.1 Distributed Approaches

In our categorization, a distributed approach to solving the multi-language interoperability problem is characterized by the way two or more software components, written in different programming languages, communicate. If this communication occurs by means of Input/Output (I/O) streams, then we say it is a distributed approach (regardless whether the actual components reside on the same machine or not).

Furthermore, we identify the more traditional client-server architectures, where a custom protocol is defined between a client and a server. This allows both client and server to be written in different programming languages. However, multi-language interoperability may not be of primary concern to these architectures.

Next to that, there are the more formal approaches that actually aim to solve the multi-language interoperability problem by providing an architecture with a single, standardized, distributed, protocol (or sometimes even a collection of protocols).

For instance, there are the broker-based architectures. Under this approach we classify such solutions as1:

• Corba[46, 67];

• ILU[36];

• PolySpin[20];

Furthermore, we have the more recently added Service-Oriented Architectures (sometimes more colloquially referred to as web services, though SOA possibly implies more than just that). As these approaches are relatively recent, there is no consensus yet that unambiguously defines them, but we think it is safe to say that they concern an architecture of loosely coupled services. Examples of these approaches are2:

• Enterprise Service Bus (ESB)[31];

• ToolBus[23, 40];

• Web services (e.g. approaches using SOAP[34], REST[10], etc.);

And then there is the unix pipe approach, which, by our classification, can definitely be seen as a distributed approach to the multi-language interoperability problem: Two shell programs (be they scripts or compiled programs), written in different programming languages, can be made to work together using the unix pipe by means of input/output streams. The actual protocol, however, differs from program to program and pipe to pipe.

On an interesting note, the ToolBus (categorized above) can be used to interface to (shell) programs by means of the unix pipe facility.

3.1.2 Non-Distributed Approaches

A Non-distributed approach, in our categorization, implies an approach that, in contrast to a distributed approach, does not use input/output streams to establish a form of communication between two or more software components, written in different programming languages.

1These lists are not exhaustive.
2Idem.

Typically, this means that special features have to be implemented in order to allow for multi-language interoperability. These approaches are more similar to what Barrett et al. [20] refer to as low-level approaches, even though they are not strictly the same.

On a special note: The choice of non-distributed approaches to multi-language interoperability does not exclude using distributed approaches, both approaches can easily coexist in large distributed software projects. The actual distinction we make is in the way the communication between the components is established.

There are several ways in which two software components, written in different languages, can be made interoperable, but it often requires that one of the two components is somewhat bilingual, in that it incorporates the above-mentioned special features to allow for the communication. Typically, this means that there must be some mechanism available to share data objects and/or call functions across the border.

The three most prominent approaches we see are:

1. Embedding a scripting engine inside a certain component;
2. the ability to introduce new functions or methods cross-border; and
3. compiling different languages into a common new language.

3.1.2.1 Embedding a Scripting Engine

In this approach, a scripting engine is embedded as part of a software component. The scripts are then interpreted by this component and usually one or more of the following interactions occur:

• The results of the executed scripts are returned (and sometimes converted into a result that is usable in the host programming language).
• The host makes special functions or data objects available that can be accessed, altered or invoked during script execution.

An example of the former would be Java Terms[41].

For the latter option, many examples exist, ranging from embedding special-purpose or custom scripting engines in projects such as id Software’s popular game series, Quake, where a custom scripting engine with C-like syntax, QuakeC, was used to script game AI; to embedding more general, existing scripting engines, such as (but certainly not limited to) Lua, Python, Ruby, Tcl/Tk, etc.

3.1.2.2 Introducing New Functions

The second approach is characterized by the possibility to introduce new functions from one side of the border to the other. Typically, one programming language, or execution environment, allows new function calls to be introduced that can invoke methods or functions across the border, similar to remote procedure calls. This facilitates the communication.

Examples of such approaches are:

• Java Native Interface - Here ‘‘native Java methods’’ can be introduced that can be invoked directly from within the Java Runtime Environment (JRE). The native Java methods are implemented as specially prepared C functions.

• Ruby Extension API - Here specially prepared functions, implemented in C, can be registered as ordinary Ruby methods.

For other programming languages (e.g. Tcl/Tk, Python, etc.) similar solutions exist.
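The common mechanism behind these APIs can be sketched, in miniature, as a table that maps names to native function pointers (a deliberate simplification of our own; neither JNI nor the Ruby extension API works exactly like this):

```c
#include <string.h>
#include <assert.h>

/* A miniature version of the mechanism behind JNI and the Ruby
   extension API: the host keeps a registry mapping names to native
   function pointers, and the guest language invokes them by name. */
typedef int (*native_fn)( int );

struct binding { const char *name; native_fn fn; };

static struct binding registry[16];
static int registry_len = 0;

/* What rb_define_method-style registration boils down to. */
void register_native( const char *name, native_fn fn )
{
    registry[registry_len].name = name;
    registry[registry_len].fn = fn;
    registry_len++;
}

/* What the interpreter does when a script calls a bound name. */
int invoke_native( const char *name, int arg )
{
    for( int i = 0; i < registry_len; i++ )
        if( strcmp( registry[i].name, name ) == 0 )
            return registry[i].fn( arg );
    return -1;  /* unknown function */
}

/* An example "specially prepared" native function. */
static int triple( int x ) { return 3 * x; }
```

The real APIs add type mapping and conversion on top of this name-to-function dispatch, but the registration-and-lookup core is the same.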

3.1.2.3 Compiling Into a Common Language

The third solution is inspired by the thought that, eventually, every program can be rewritten as a sequence of machine instructions, regardless of the programming language it is written in. It seeks to define a single programming language that incorporates all important aspects of all supported (other) programming languages. These are then projected onto –compiled into, if you will– this one programming language:

Once two software components written in different programming languages have been compiled into the single target programming language, they are able to communicate (e.g. access common data objects or invoke each other’s functions).

The most prominent in this field, and one of the most recent additions, is filed under ECMA-standard 335, also known as the Common Language Infrastructure, with its most widely used implementation: Microsoft’s .Net.

3.2 A Closer Look

As a part of our language selection process (which we shall discuss later in chapter 5), we have looked into some of the existing solutions in more detail, therefore it is useful to introduce these here in a little more detail too.

3.2.1 Java

Java is a programming language originally developed by Sun Microsystems and released in 1995. Java source code is typically compiled into bytecode that is to be executed by a virtual machine. This Java Virtual Machine (JVM) is part of the Java Runtime Environment (JRE) which also comes with a wealth of standard libraries.

Because of the virtual machine design, Java supposedly allows for a write-once run-anywhere software development strategy, where programmers need not worry about the portability of their programs, since the whole virtual machine is ported to new target platforms, instead of the individual applications.

Java Native Interface

To interface Java, and its virtual machine, with the underlying native operating system, it comes with the Java Native Interface.

In order to establish the interface to the native operating system, one writes specially prepared functions in C, which can then be bound to the JVM, as if they were ordinary Java methods, by declaring them native. Figure 3.1 illustrates this.

#include <jni.h>
extern "C" JNIEXPORT jobject JNICALL
Java_NativeExample_doSomething( JNIEnv* env, jobject obj, jobject arg )
{
    jobject res = NULL;
    jclass cls = env->GetObjectClass( obj );
    // 1. do type mapping/conversion Java -> Native

    // 2. do native things

    // 3. do type mapping/conversion Native -> Java and return result
    return( res );
}

public class NativeExample
{
    native MyResultObject doSomething( MyJavaClass obj, Hashtable arg );
}

Figure 3.1: A specially prepared C function (top) is bound to Java as a native method (bottom).

3.2.2 Common Language Infrastructure

The Common Language Infrastructure is an open specification that defines a system that allows software components written in different, but supported, programming languages to seamlessly interoperate. According to [9], the CLI specification consists of four parts:

1. The Common Type System (CTS) - ‘‘A set of types and operations that are shared by all CTS compliant programming languages.’’

2. Metadata

3. The Common Language Specification (CLS) - ‘‘A set of base rules to which any language targeting the CLI should conform in order to interoperate with other CLS-compliant languages.’’

4. Virtual Execution System (VES) - ‘‘The VES loads and executes CLI-compatible programs, using the metadata to combine separately generated pieces of code at runtime.’’

As stated previously, the CLI is an open standard and it is filed under ECMA-standard 335.

There are currently three alternative implementations of the ECMA standards, of which Microsoft’s .Net[3] is the most widely known implementation. The two other implementations are:

1. The Mono Project[6] - An open source implementation, sponsored by Novell.

2. DotGNU[2] - The part of the GNU project that aims to provide a free implementation of .Net.

Because .Net is by far the most widely known implementation of the above ECMA standards, we will use it as a proprietary eponym for any implementation of the CLI in the remainder of this thesis.

Common Intermediate Language

As we described earlier in this chapter, interoperability approaches similar to the CLI are categorized by us as approaches that compile into a common language. In the CLI, this language is called the Common Intermediate Language, or CIL for short.

Common Language Runtime

The Common Language Runtime (CLR) is often confused with the CLI; however, the CLI is the specification, and the CLR is an implementation of the execution environment that is specified by the CLI.

Implications of the CLI

There are a few implications resulting from the design of the CLI. First of all, it implies that for a language to be supported by the CLI it must be able to project itself onto the CIL.

Secondly, tricks that are supported by dynamic languages, such as Ruby, Python and so on are in general not supported by the CLR.

So while the CLI is a technical achievement in its own right, the range of languages that it supports is not unlimited.

Encapsulation Flaw

In theory, .Net promises absolute seamlessness between all software components written in languages that are supported by the CLR (which must be incorporated into .Net, of course). However, as Beugnard [25] shows, in practice the seamlessness appears to be less than absolute, because of slight differences in encapsulation between various object-oriented languages.

3.3 Are Dynamic Languages the Future?

The fact that we are aiming for a language interface between a higher-level language and a lower-level language, makes this a very interesting question. (It is likely that we are going to select a dynamic language as the higher-level language.)

Well, are they? Of course, nobody can see into the future. However, recent developments by both Sun and Microsoft (the driving forces behind Java and .Net) show increased interest in dynamic languages; both software giants are extending their frameworks with support for more dynamic languages.

Dynamic Support for Java

Sun is currently trying to add dynamic support to their Java Virtual Machine. This implies alterations (or additions) to the bytecode specification. For more information on this subject see [7].

DLR

Microsoft, too, is joining in on the dynamic future by acknowledging the limitations of the CLR: hard work is being done on the Dynamic Language Runtime (DLR)[8].

Chapter 4

Goals and Requirements

In this chapter we identify the goals and requirements for our multi-language software development solution.

We have three sources from which we derive these. First we limit the field of application by setting a few requirements that act as a starting point. Then we derive additional goals and requirements from the three scenarios that we presented in chapter 2. Finally, we supplement the established list with goals and requirements derived from existing literature.

The difference between a goal and a requirement is that a requirement always defines a bounded and measurable outcome, while a goal represents something for which one does not know up front to what extent it will be reached (i.e. for which it is impossible to define such bounds). In this chapter, goals and requirements are specified in mixed order.

4.1 Field of Application

First we limit the field of research by setting a few requirements that act as a starting point for our solution.

The first requirement that we set is to address multilingual interoperability, rather than polylingual interoperability. We want to interface components written in one of two programming languages.

Requirement 1: Present a solution that addresses the multilingual rather than polylingual interoperability issue.

Three moments in time can be distinguished in which interoperability between two software components A and B is required and to be added[20]:

1. before both A and B have been written.

2. after A has been written, but before B.

3. after both A and B have been written.


We only target the first two situations; the third, which is sometimes called ‘‘megaprogramming’’[20, 26], is a complex problem in its own right, which we deliberately leave outside the scope of this thesis.

To make it easier to address these two situations later, we shall describe the first situation as having two original software components. The second situation shall be described as a situation with an existing software component.

Requirement 2: Present a solution that targets both interoperability between two original software components as well as interoperability between an original and an existing software component.

In the case of interfacing with an existing software component, this existing software component may have been written without taking multi-language interoperability into account. We must therefore be able to establish a fully effective interface from just one side (i.e. from the original component).

Requirement 3: Present a solution that makes it possible to establish the interface from just one side.

4.2 Productivity Revisited

In chapter 2 we argued in scenario 1 that an increase in software productivity may be expected –taking all factors in consideration, of course– when multi-language software development is used, by shifting certain portions of the software to either a higher-level programming environment, or a lower-level programming environment.

Validating or falsifying such a hypothesis directly would require both a solution and several case studies to compare the productivity of software development that uses the solution with software development that does not. Apart from the fact that software productivity is very hard to compare across software projects, because of the many factors that influence it, such extensive case studies are well beyond the scope of this Master’s Thesis.

We therefore keep to the existing argumentation for doing multi-language development (the expected productivity gain) and assume this to be true.

Instead, we shall focus on the underlying assumption that introducing multi-language development into a software project does not by itself impact the productivity so much that the overall expected productivity gain will be lost.

This assumption forms one of the main starting points for our research. We will attempt to minimize the impact of adding multi-language development to such a software project. We think there are two important factors that influence software development productivity when multi-language software development is added:

1. Interface creation and maintainability

2. Efficiency of the interface in use

4.3 Interface Creation and Maintainability

4.3.1 Interface Creation

When introducing multi-language software development, before a multi-language interface can be used, it must first be established. This requires interfacing different languages with different features; for example, a sensible mapping must be established between the language features of the different languages. Such efforts lead to the construction of a multi-language interface. However, they also distract attention from the original software project’s goal, which was not multi-language software development per se. (Please remember: to the ignorant consumer, the functional value of software is more important than how that software is constructed.)

Therefore, in order to minimize the impact of introducing multi-language development into the software development process, the construction of the multi-language interface must be kept simple and easy to create.

Goal 1: Present a solution that keeps it simple to construct the multi-language interface.

4.3.2 Maintainability

Once a multi-language interface is constructed, it must be maintained. If, for instance, the interface of an existing third party component changes, then these changes must be reflected back to the multi-language interface.

We see three possible risks regarding the maintainability of a multi-language interface:

1. Synchronization errors - If it takes too much time to maintain the interface, there is a risk that the interface to the third party component gets out of sync with the first party software project, because the original project goals (the functional value to the consumer) will be more important than keeping the interface in sync.

2. Heterogeneous mapping - If multiple people are responsible for creating and maintaining the multi-language interface, there is a risk that the mapping between certain features (e.g. types) becomes heterogeneous: one developer uses the multi-language interface to apply one possible mapping, another developer uses it to apply another. Heterogeneous mapping adversely affects both the usability and maintainability of the multi-language interface between two software components.

3. Black boxing - If the person, or persons, responsible for the original multi-language interface have left the development team1, then there is a risk that important information concerning the interface (and its construction) is lost. In the most extreme case the multi-language interface becomes a black box, where no current team member knows any details concerning the multi-language interface at all, making it harder, or even impossible, to maintain the interface adequately.

1This is not uncommon after the release of product version 1.0.

For the productivity of multi-language software development it is therefore important to minimize these three risks.

Goal 2: Present a solution that minimizes the risk of: a. synchronization errors, b. heterogeneous mapping, and c. black boxing.

4.4 Efficiency in Use

A multi-language interface between two software components written in different languages must also be efficient during use. If using the interface is too cumbersome, then any productivity gain introduced by the more productive language may be lost.

We identify several factors that we think may positively affect the efficiency in use of a multi-language interface:

• Increasing seamlessness

• Dynamic boundaries

• Separation of concerns regarding portability

4.4.1 Increasing Seamlessness

If we attempt to interface a software component written in a lower-level programming language, which is considered to be less productive and more elaborate, to a software component written in a higher-level programming language, which is considered to be more productive and more concise, then it is important that the factors that cause the lower-level programming language to be less productive and more elaborate do not influence the productivity and conciseness of the higher-level programming language.

In section 2.1.7 we gave a definition for seamlessness. As we stated there, seamlessness of multi-language interoperability is influenced by various factors, for instance:

• Encapsulation - The ways in which different programming languages are able to bundle related data in the form of objects, structures, records, functions, procedures, etc.

• Information hiding - For instance hiding certain implementation details that are explicit in one language, but transparent in another (e.g. pointers are explicit in C++, but their use is transparent in Java).

We add the following two requirements to increase seamlessness:

Requirement 4: Present a solution that provides at least the same level of encapsulation to the higher-level programming language as the lower-level programming language does.

Goal 3: Present a solution that hides unnecessary implementation details of the lower-level programming language from the higher-level programming language as much as possible.

4.4.2 Dynamic Boundaries

In requirement 2 we require to target the multi-language interoperability problem for two situations:

1. Original software components - Both components have yet to be developed.

2. Single-sidedly developed components - One of the two components has already been developed (possibly without taking multi-language interoperability into account).

While there are possibly more ways to distinguish situation 1 from situation 2, one distinction seems very prevalent: in the case of situation 2, the interface (i.e. the boundary) that separates both modules is very clear and in fact complete (although third-party development activities2 may still cause changes to the interface to be made).

On the other hand, in the case of situation 1, it is not always clear from the start what has to go to which end of the fence, leading to a more dynamic boundary that stabilizes as development progresses.

Our observation is confirmed by Phelps and Parks [55], who, in their research of multi-language development, note that, during the development of their Muppets game engine, it was not always clear which functionality was to go to which side of the boundary.

This points us to our next goal:

Goal 4: Provide a solution that makes it as easy as possible to adapt the boundary between the software components, i.e. to change what goes to which side of the fence.

4.4.3 Portability

When a software developer develops a computer program and subsequently publishes it, for example by making it available for download, he will more often than not be confronted with the problem that the application does not behave in the same way, or does not even work at all, on the computer of the end user.

These problems are likely caused by portability issues.

‘‘Porting is the process of adapting software so that an executable program can be created for a computing environment that is different from the one for which it was originally designed.’’[14] Subsequently, portability is a measure of how well software is capable of being ported. There are no real metrics to express portability, or to relate the portability of one piece of software to another.

2For example when a new version of the component is released.

However, as any seasoned software developer with experience in both C and Java can tell you: once the Java Virtual Machine is ported to, and deployed on, a target computer system, a Java application is unmistakably easier to port and deploy than any C program is.

This is because when a C application has to be developed, it must take portability issues into account for each of the expected target platforms, before, or at least during, development. Then, before the application can be deployed, it has to be separately compiled for each target platform.

If, at some later stage, new target platforms are identified that had not been taken into account during development, then portability hell ensues, because, chances are, the application will not compile for the new target platform.

In a sense, this is history repeating itself, because in the early days of C, it was C that provided the great portability benefit over, for example, lower-level assembly languages, or, even worse, microcode. These were often strongly tied to a specific processor or machine.

Now let us get back to efficiency in use...

One of the reasons a Java application is easier to port is Java’s support for the separation of concerns with respect to the portability of an application: the responsibility for the portability of the application is taken out of the hands of the application developers; it is delegated to dedicated porting developers. These porters focus on porting the execution environment, not the individual applications that make use of it.

Once an execution environment (e.g. the Java Virtual Machine and Runtime Environment) has been ported to a new target platform, all previously developed applications can be deployed on it. This relieves the application developer of the burden of taking target platforms into account before, during, and after the development of his application.

Requirement 5: A solution must be presented that allows for the separation of concerns regarding portability related issues.

4.5 Multi-Disciplinary Software Development Revisited

In scenario 2 of chapter 2 we presented a scenario where multi-language software development could be used in an environment where developers of different skills and disciplines are to cooperate.

The idea was, for example, to easily disclose the functionality of components written in lower-level languages to higher-level languages that can be used by people who are less skilled, or not skilled at all, in these lower-level languages.

This scenario, too, benefits from increasing seamlessness, because it will shield the less skilled developers, who work in the higher-level language, from the many complicating details of the lower-level language. However, since seamlessness is already on our list of goals, we do not have to add it here.

4.6 Effectiveness

In scenario 3 of chapter 2 we argued that by combining two programming languages with distinct features, we may increase the effectiveness beyond that of any single programming language.

For example, C gives access to pointer arithmetic and explicit memory access which is useful sometimes, while Java hides explicit memory management using a garbage collection scheme, which is useful at other times.

This scenario, of course, assumes that an effective mapping of all language features is possible.

Furthermore, earlier, in section 4.3.2, we argued that for the sake of maintainability, the multi-language interface should be easy to construct and maintain.

Then we added goals and requirements of which we think they will have a positive influence on the above, such as seamlessness and information hiding.

However, the simplifications that are likely to be introduced in order to meet such goals and requirements may lead to a solution that is not always able to interface any two software components A and B (again, one of which may already have been written, possibly without taking multi-language interoperability into account).

Furthermore, from previous experience (which we will share in the next chapter), we know that solutions similar to what we have in mind have proven to be ineffective when interfacing with some (of the more complex) existing components.

We do not want to fail because of similar faults; it is therefore very important to us that any solution be 100% effective, regardless of simplifications added for the sake of maintainability or seamlessness.

Requirement 6: Provide a solution that always allows an effective interfacing between two software components A and B, regardless of any simplifications added for the sake of maintainability or seamlessness.

4.7 Known Issues in Multi-language development

We have also searched existing literature for known issues concerning multi-language development. Phelps and Parks [55] summarize these in four common sources of problems:

1. Lack of documentation

2. Performance

3. Memory management

4. Threading

These four common sources of problems are not the focus points of our research. However, our solution ultimately has to address them. So, where applicable, we want to provide starting points for each of these subjects, to prevent coming up with a solution that completely disqualifies itself with respect to any of these problems.

4.7.1 Lack of Documentation

During the construction of their research game engine Muppets, which uses an explicitly programmed interface based on the Java Native Interface (JNI), Phelps and Parks found that:

• Few resources exist about how to make multiple languages behave together.

• Existing documentation typically focuses on simple, non-threaded examples passing only basic data types (such as integer and floating-point types).

• It is not clear how to handle type casting.

• It is not clear how to debug across multiple languages.

To these points, we would like to add the following point of our own:

• Documentation about how to handle more complex types is lacking.

It is hard to form a usable requirement from Phelps and Parks’ observations, since it is very hard to determine when enough (adequate) documentation is provided. So we add the following goal:

Goal 5: Strive for a solution that requires less (detailed) documentation.

The documentation that is still required afterwards may then, in part, be covered more easily by this thesis.

4.7.2 Performance

The word performance can mean a great many things and –more importantly– can have a different meaning to different people. To most people, the term performance refers to one or more of the following[62]:

1. Computational performance - How many computations can be done per amount of time? (E.g. how fast is a program?)

2. Memory footprint - How much memory (RAM) is occupied while a program is being used?

3. Startup time - How much time passes between starting the program and the moment it is ready to be used?

4. Scalability - How scalable is a program, solution or algorithm if more instances are used?

5. Perceived performance - How responsive is the program in the experience of the consumer or end user?

4.7.2.1 Phelps and Parks on Performance

Phelps and Parks, in their paper, only address performance as computational performance. They note that, in multi-language software development, much performance is lost by moving things across the bridge.

From the paper it is not clear whether this applies to both objects3 and function calls, but we assume it does.

They do state that if a cross-lingual function call is fairly expensive relative to a local function call, then, from a performance perspective, it is best to minimize the number of calls crossing the bridge. Choices like these typically impact the API or interface of the software modules. This corresponds to our 80/20 conjecture, by which we argue that if performance is important, the computational algorithms are best isolated and implemented in the lower-level language.

However, since we also require via requirement 2, that our solution must be able to handle existing components (which may not have been written with multi-language development in mind), this is not under our control.

Following from scenario 1 in chapter 2, it is likely that our solution will interface a more productive, higher-level programming language, with less computational performance, to a less productive, lower-level programming language, with more computational performance.

We assume that, for the sake of productivity, less prominent features, that do not require much computational performance, be programmed in the higher-level programming environment.

So a sensible computational performance requirement for our solution could be, to provide a computational performance that is comparable to the performance of the higher-level programming language. Similar things can be said on startup time performance and perceived performance.

Goal 6: Provide a solution with a computational, startup, and perceived performance that is as comparable as possible to that of the higher-level programming language.

4.7.3 Memory Management

‘‘Memory management is the act of managing computer memory. In its simpler forms, this involves providing ways to allocate portions of memory to programs at their request, and freeing it for reuse when no longer needed.’’[13]

Some parts of memory management4 are handled at operating system level, completely transparent to the running program, or the programming language that this program was written in.

3Things occupying memory space.
4For example, protection, sharing, and logical and physical organization.

However, the two activities mentioned in the definition, allocating portions of memory, and freeing it for reuse when it is no longer needed, are tasks that have to be performed by any program or execution environment.

This can be done either explicitly (e.g. malloc in C) or implicitly (e.g. Object.new in Ruby). Tracking whether memory is no longer used can either be done manually, such as in plain C, or it can be done automatically using various garbage collection schemes. An automatic garbage collector can be part of a programming environment by default (e.g. Java, Ruby), or it can be an add-on (e.g. the Boehm garbage collector for C and C++[29, 28]).

Memory management poses a large problem for multi-language interoperability, because it is likely that the programming languages involved handle memory management in different ways.

4.7.3.1 Object Sharing

Memory management problems occur mostly because of object sharing, i.e. the fact that a software module references an object from a different software module.

Two particular memory management problems that are likely to surface between two software modules A and B, and an object x, are:

1. Dangling pointers

2. Memory leaks

A dangling pointer is characterized by the following occurrences:

1. B internally references object x.

2. A references object x from B.

3. B removes the internal reference to object x.

4. B reclaims the memory of object x.

5. A still references object x which now no longer exists.

Of course, one could argue that B should not have reclaimed the object, but unthinkingly retaining all allocated memory is likely to cause memory leaks.

A memory leak is characterized by the following occurrences:

1. B internally references object x.

2. A references object x from B.

3. B removes the internal reference to object x.

4. B does not reclaim the memory of object x.

5. A removes its reference to object x.

6. Neither A nor B references object x, but it is still in memory.

As we see, memory management can be a delicate issue in multi-language development. It becomes even more delicate if we consider requirement 2, where we require that we must be able to interface with existing software modules. Such modules may not have been developed with multi-language interoperability in mind.

Because memory management is such a complicated problem in itself, and because it is not the focus point of this thesis, we only seek starting points for addressing memory management using our solution in the future.

Requirement 7: Provide starting points for addressing memory management using the found solution.

4.7.4 Threading

‘‘A thread in computer science is short for a thread of execution. Threads are a way for a program to fork (or split) itself into two or more simultaneously (or pseudo-simultaneously) running tasks. Threads and processes differ from one operating system to another, but in general, the way that a thread is created and shares its resources is different from the way a process does.’’[17]

Problems with threading are the fourth common group of multi-language interoperability problems identified by Phelps and Parks.

Likely problems and questions that they identify regarding threading are:

• How to synchronize native versus non-native threads?

• How to handle thread locking?

• References available in one thread may not be available in other threads.

One thing Phelps and Parks propose is to prefer non-native threads over native threads because of their positive effect on portability.

Unfortunately, threading in multi-language development becomes even more delicate considering that we seek a solution to the multi-language interoperability problem in which one of the two software modules may already have been written (possibly without even taking multi-language development into account). Since the multi-language threading problem, like the multi-language memory management problem, is such a complicated problem in itself, we only seek starting points for our solution regarding multi-language threading.

Requirement 8: Provide starting points for addressing threading using the found solution.

Chapter 5

Language Selection

From the previous chapter, we now know that we are looking for a solution to the multilingual1 interoperability problem. In this chapter we answer the question: ‘‘Which languages do we interface?’’

To answer it, we first pose some selection criteria, which are, like the goals and requirements of the previous chapter, derived from the three scenarios of chapter 2. Then we discuss which alternatives have been considered, what existing work exists, and why it is considered inadequate.

5.1 Selection Criteria

From the thesis’s focus, which we established in the introduction, we know that all languages that we consider in this thesis must be general-purpose imperative programming languages. We do not consider other languages, such as, but not limited to, domain-specific, declarative, functional, or logical programming languages.

Our selection criteria are then directly derived from the three scenarios of chapter 2.

5.1.1 Productivity versus Execution Speed

In our first scenario, Productivity, we saw that there are programming languages that, under the right conditions, may be more productive than others. That is, we saw that the amount of source code required to implement the same functional value varies across programming languages (figure 2.1).

Then, we posed our 80/20 conjecture, in which we stated that most source code fulfills a supporting role and accounts for only a fraction of the total execution time of a program, while only a small portion of the source code accounts for most of the total execution time.

1Recall its definition from section 2.1.2.


We then made the assumption that, from a productivity point of view, there may be a productivity gain if the non-computationally-intensive, supporting source code (~80%) is programmed in a more productive programming language. A more productive programming language, however, tends to be slower to execute, which suggests that the computationally intensive portion (~20%) of the source code be programmed in a programming language with higher execution speed. Such programming languages tend to be less productive.

So, as a selection criterion, we pose that the two languages that are to be interfaced meet the following requirements:

1. One of the programming languages must be considerably more productive than the other.

2. The less productive programming language must have a considerably higher execution speed.

5.1.2 Feature Set

Next, in scenario 3 (section 2.4) we explained that different programming languages can have different language features. Then we made the assumption that interfacing two languages with different feature sets should increase the effectiveness beyond that of any single such programming language.

Obviously, the criterion that follows from this scenario is that, if we want to maximize the outcome of our assumption, we need to select two programming languages with very distinct feature sets.

5.1.3 Easy Language

In scenario 2 (section 2.3) we argued that, in large software projects, there are different people with different skills working together on the same project. Such projects can be both in a commercial or a non-commercial setting.

We argued, that a large group of developers, or contributors, develop in a higher-level language, that is easier to use, and easier to master.

The criterion that follows from this scenario is: one of the programming languages has to be an easy-to-master programming language. A scripting language is a typical example of an easy-to-master programming language.

5.2 Ruby and C

After some consideration, we have selected Ruby[37, 1] and C[44] as the languages to interface. There are various reasons for selecting these languages, which, as can be seen in table 5.1, all match the selection criteria.

In addition to these criteria, we have selected C because it is a small and well-specified programming language. C is often used as the principal programming language for operating systems, both for desktop computing and for embedded systems. Because of this, the main interface for communicating with such an operating system is a C interface.

We think it is important to also select a programming language that has a large and active user base, because then the list of available extensions, add-ons and packages grows, which may aid future productivity through software reuse.

In the light of this, we selected Ruby because it meets the selection criteria and, in recent years, has started to gain a lot of popularity. The language itself has been in existence since 1993 and was inspired by Ada, Eiffel, and Smalltalk, with influences from other scripting languages. Through the years Ruby has developed into a stable scripting platform, and the influences from these other programming languages have led to a seemingly productive –although, as yet, no real-world figures could be found– and easy-to-master programming language2.

We also selected Ruby, because the official interpreter is written in C. In fact, Ruby objects are mere wrappers around the C objects in which they are implemented. We seek to exploit this design decision in our solution (and in future developments) because it simplifies our design, it helps us to increase seamlessness, and it can possibly be used to address memory management issues.

Ruby                             C
-----------------------------    -----------------------------
interpreted language             compiled language
object oriented                  procedural
scripting language               non-scripting language
easy to master                   harder to master
dynamic language                 static language
concise/more productive          elaborate/less productive
slow execution speed             fast execution speed
automatic memory management      explicit memory management
memory hungry                    memory efficient
no overloading                   no overloading

Table 5.1: Some language features of Ruby and C compared

This table is by no means a complete listing of all language features of either of the two languages, but it gives a good impression of the differences in feature sets.

5.3 Consequences for the Goals and Requirements

Our choice for interfacing Ruby and C has a few consequences for some of the goals and requirements from the previous chapter.

2This is drawn from our own experience and from the enthusiasm of those who have come into contact with it.

5.3.1 Shared Libraries

Because of requirement 2, our multi-language interface between Ruby and C must target the interfacing between two original software components, as well as the interfacing between an original component and an existing component. This requirement can be made more concrete. If we consider any existing component A to be written in C, then there already is a very popular way for publishing such semi-finished products in C: the shared library.

A shared library, also called a dynamic (link) library, is a semi-finished product. It is a container in which a certain, but confined, amount of functionality is collected that shares a specific goal or purpose: for example, handling of XML files, or handling of secure network connections using specific protocols. This functionality is packaged separately from an end user product because, ultimately, the goal is that multiple end products will be able to share it. This maximizes software reuse and helps improve productivity. A shared or dynamic link library is loaded and linked into an application (an end product) at run-time, so that the application can make use of its functionality.

So, from our choice of languages and from requirement 2 we derive that our solution must target shared C libraries.

5.3.2 Run-time

Our choice for interfacing Ruby and C also affects requirement 5 (section 4.4.3). In this requirement, we require that our solution must allow for easy software deployment and delegate portability related issues as much as possible.

Since portability related issues are intrinsic to software development in C3, it is not possible to relieve the end user of portability related issues so long as he writes C source code. Ruby, however, is more like Java, where the execution environment is ported, rather than the individual application.

Furthermore, there are many existing software modules available as shared C libraries that are already ported to, and that can be deployed on, a number of target platforms. If we want to relieve the end user of portability related issues, we must present a solution to which the end user can delegate these issues, so that the end user can stay in the already easy, and portability friendly, Ruby environment.

One solution that allows for all of this is to let our solution establish the interface between Ruby and C at run-time: once our solution is confirmed to be working on a target platform, it should work for establishing an interface with any shared C library on that platform.

3This is, of course, less so with C than with, for instance, assembly languages, but still.

5.4 Considered Alternatives

Before we set out to develop our own solution, a number of alternatives have been considered, two of these will be discussed in this section.

5.4.1 Java & The Java Native Interface

There are a number of arguments that can be used to invalidate the choice of Java and The Java Native Interface to meet our goals and requirements. These are both theoretical and technical in nature.

5.4.1.1 Theoretical Arguments

To start with our first argument: Java is not a scripting language.

There are a number of scripting engines readily available for Java, so one could say: ‘‘Why not interface Java to a scripting engine?’’ These scripting engines, unfortunately, are all of the ‘‘embedded type’’ that we previously discussed in section 3.1.2.1.

In a multilingual interoperability solution, this would make Java, not C, the lower-level language. But Java and its execution environment (JRE) are based on a virtual machine. This is much too high a level to effectively interface with the native operating system; hence the existence of the Java Native Interface.

Neither High-level nor Low-level

This brings us to the most prominent theoretical argument against choosing Java as one of the languages in a multilingual interoperability solution: in our notion of higher-level versus lower-level languages, where does Java go? In our sense, it is neither a relatively high nor a relatively low-level language. It is true, Java is not C, but it is not Python or Ruby either; it is something in-between.

This makes it a less than ideal choice for our higher-level versus lower-level multilingual interoperability research.

Next, we will also present several technical arguments that invalidate Java as our choice of language. For the sake of argument, we will disregard the presented theoretical arguments during the discussion of the technical arguments.

5.4.1.2 Technical Arguments

There are also a number of technical arguments that can be made against choosing Java and the JRE for our research. We will discuss the most prominent of these.

Java is, by design, a compiled language. As we have seen, the compilation of Java source code results in byte code which is interpreted by a virtual machine, the JVM. We see a few problems resulting from this design.

Limited Reflection and Dynamic Programming Support

First of all, Java comes with reflection support; however, this only extends to reflecting existing classes, methods and variables. Although things might be changing a bit in this regard (recall section 3.3), when our research was started there was no support for creating new classes and/or methods at run-time. In Ruby this is, by its very nature, very easy to do, and our solution makes use of it extensively.
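To illustrate Ruby's dynamic nature, here is a minimal, self-contained sketch (the class and method names are our own, for illustration only) of creating a class and its methods at run-time:

```ruby
# Create a brand new class at run-time; 'greet' is defined the ordinary way.
klass = Class.new do
  def greet(name)
    "hello, #{name}"
  end
end

# Register the class under a constant and attach another method dynamically.
Object.const_set(:Greeter, klass)
Greeter.send(:define_method, :shout) { |name| greet(name).upcase }

g = Greeter.new
g.greet("world")   # => "hello, world"
g.shout("world")   # => "HELLO, WORLD"
```

Nothing here requires a compilation step or special reflection API; class and method creation are ordinary run-time operations in Ruby.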

Extra Native Code

Secondly, the Java Native Interface requires native functions to be implemented in C, after which they are compiled into a shared library. This not only applies when writing two original software components, one in C and one in Java, but also when interfacing existing software components to Java. This makes it a lot less easy to interface to an existing component, because there is always a man (or rather, a shared library) in the middle. Furthermore, this shared library must be written in C, and so portability related issues may apply. As an illustration:

Figure 5.1: Interfacing to existing libraries: The Java Native Interface requires an extra wrapper library.

Programmatic Nature

In addition to the previous argument, recall from figure 3.1 in section 3.2.1 that, in JNI, specially prepared functions have to be written. In order to use this to interface to an existing shared library, these functions must contain source code that:

1. performs the type mapping from Java to C,
2. invokes the actual library function,
3. converts the result back to a Java type.
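The generic pattern behind such wrapper functions can be sketched as follows (a toy illustration in Ruby; to_c and to_ruby are hypothetical converters standing in for the real type mapping):

```ruby
# Hypothetical converters standing in for real marshalling code.
def to_c(value)
  value.is_a?(String) ? value.bytes : value      # Ruby String -> C-like byte array
end

def to_ruby(value)
  value.is_a?(Array) ? value.pack("C*") : value  # byte array -> Ruby String
end

# The wrapper: map in, invoke, map out.
def call_native(fn, *args)
  c_args = args.map { |a| to_c(a) }  # 1. type mapping into C
  c_result = fn.call(*c_args)        # 2. invoke the actual library function
  to_ruby(c_result)                  # 3. convert the result back
end

# A stand-in "native" function that reverses a byte array:
native_reverse = ->(bytes) { bytes.reverse }
call_native(native_reverse, "abc")   # => "cba"
```

Every interfaced function needs a hand-written wrapper of this shape, which is exactly the programmatic burden criticized below.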

This approach leads to a programmatic interface creation strategy, and thereby increases, rather than minimizes, the risks that we have previously identified regarding the maintainability of the language interface: synchronization errors, heterogeneous mapping and black boxing.

5.4.2 .Net and The Common Language Infrastructure

In the related work chapter, we have already seen that .Net (an implementation of the Common Language Infrastructure) achieves multi-language software development by compiling all supported languages into a single intermediate language. Even though the range of languages that are supported by this intermediate language is limited –a lack that is hopefully being remedied, to some extent, by the DLR4– this approach makes it particularly fit for developing software where all components are developed as part of the framework (for example, with two original software components to be written in .Net, or with existing software components that have been written in .Net).

To interface .Net with existing components, such as shared libraries, that have not been developed within the framework, it comes with two facilities called DllImport and P/Invoke.

5.4.2.1 DllImport and P/Invoke

One interesting thing that sets .Net apart from a solution such as the Java Native Interface is that it comes with an extra facility that makes it possible to load a shared library and bind its functions at run-time, without the need for an extra wrapping shared library. This is made possible by DllImport and P/Invoke5.

DllImport allows one to dynamically load the shared library at run-time, while P/Invoke allows one to invoke the functions it contains.

For instance, the following source code fragment allows for the successful loading of the standard POSIX C library, and subsequent invocation of the usleep function (which halts the calling application for the given amount of time, in microseconds).

using System.Runtime.InteropServices;

public class LIBC
{
    [DllImport("libc.so")]
    public static extern void usleep( int ms );
}

Figure 5.2: Dynamically loading libc.so and binding the usleep function in .Net.

The dynamic loading and binding of functions works out rather nicely and no extra wrapping library is needed in .Net, as can be seen in figure 5.3.

4See section 3.3.

5The P stands for Platform.

Figure 5.3: Interfacing to existing libraries in .Net: No extra wrapper library is required.

Marshalling

The function argument types are automatically converted from whatever the current language is (e.g. C#) to their C counterparts by a process that .Net calls marshalling. The default Marshaller can map all built-in simple C types such as Integer Types and Floating-point Types.

Things become a little harder when Structure Types are required. There is a Marshaller that allows one to define some C structures in .Net to facilitate this to some extent:

[C#]
[StructLayout(LayoutKind.Explicit, Size=16, CharSet=CharSet.Ansi)]
public class MySystemTime
{
    [FieldOffset(0)]  public ushort wYear;
    [FieldOffset(2)]  public ushort wMonth;
    ...
    [FieldOffset(12)] public ushort wSecond;
    [FieldOffset(14)] public ushort wMilliseconds;
}

class LibWrapper
{
    [DllImport("kernel32.dll")]
    public static extern void GetSystemTime(
        [MarshalAs(UnmanagedType.LPStruct)] MySystemTime st );
};

Figure 5.4: Marshalling Structure Types with .Net.

Unfortunately, soon after we started using the marshalling support of .Net, we encountered the limitations of these facilities.

First of all, source code in .Net (in our example C#) is supposed to be platform independent. However, the explicit field offsets that are often mandatory to get the correct structure layout are, of course, not portable. In other words: there is no separation of concerns regarding the portability of the source code. This, as we have argued before, is likely to decrease the productivity of developers using this strategy, especially as the structures become more complex: some structure types in C nest other structures, which nest yet other structures. How does one even begin to calculate the offsets in these cases?

Yet, the explicit field offsets are merely a nuisance when it comes to the ineffectiveness of the standard (struct) Marshallers offered by .Net.

To illustrate the limitations, suppose that we would like to map the following API:

/* xmlDoc: An XML document. */
typedef struct _xmlDoc xmlDoc;
typedef xmlDoc *xmlDocPtr;
struct _xmlDoc {
    void *_private;              /* application data */
    xmlElementType type;         /* XML_DOCUMENT_NODE, must be second ! */
    char *name;                  /* name/filename/URI of the document */
    struct _xmlNode *children;   /* the document tree */
    struct _xmlNode *last;       /* last child link */
    struct _xmlNode *parent;     /* child->parent link */
    struct _xmlNode *next;       /* next sibling link */
    struct _xmlNode *prev;       /* previous sibling link */
    struct _xmlDoc *doc;         /* autoreference to itself */

    /* End of common part */
    ...
};

/* xmlNode: A node in an XML tree. */
typedef struct _xmlNode xmlNode;
typedef xmlNode *xmlNodePtr;
struct _xmlNode {
    void *_private;              /* application data */
    xmlElementType type;         /* type number, must be second ! */
    const xmlChar *name;         /* the name of the node, or the entity */
    struct _xmlNode *children;   /* parent->childs link */
    struct _xmlNode *last;       /* last child link */
    struct _xmlNode *parent;     /* child->parent link */
    struct _xmlNode *next;       /* next sibling link */
    struct _xmlNode *prev;       /* previous sibling link */
    struct _xmlDoc *doc;         /* the containing document */

    /* End of common part */
    xmlChar *content;            /* the content */
    struct _xmlAttr *properties; /* properties list */
};

xmlDocPtr xmlReadFile( const char *URL,
                       const char *encoding,
                       int options );

Figure 5.5: Excerpt from an XML parsing API.

Now, the elements are not simple types anymore, they are pointers to yet other structures. In .Net it is not possible with the default struct Marshaller to address such structures in a satisfactory way.

One can choose to use P/Invoke's notion of the built-in generic pointer type, IntPtr. This allows one to map a pointer inside a structure (or any other pointer, for that matter) as a reference to a black box. This black box pointer can then be used as a stand-in wherever a reference to that pointer is required.

One could use this approach, when the C API is to be used as follows:

IntPtr mydocument = xmlReadFile( "myfile.xml", null, 0 );

xmlElementType type = GetTypeOfLastChild( mydocument );

Figure 5.6: Example of an intended API usage.

However, suppose the API was intended to be used in the following way (which is not uncommon in C):

xmlDocPtr mydocument = xmlReadFile( "myfile.xml", NULL, 0 );

xmlElementType type = mydocument->children->last->type;

Figure 5.7: Example of a different intended API usage.

The above ‘‘intended API usage’’ is not possible when using the generic IntPtr as a stand-in.
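For comparison, the same opaque-handle limitation can be seen in Ruby's standard Fiddle library, where Fiddle::Pointer plays the role of IntPtr (a sketch, shown only as an analogy):

```ruby
require 'fiddle'

# Allocate 16 opaque bytes; the pointer can be passed around and handed
# back to native functions, but no struct layout is known on this side.
ptr = Fiddle::Pointer.malloc(16)
ptr[0, 5] = "hello"   # raw byte access works...
ptr.to_s(5)           # => "hello"
# ...but a field access like ptr.children.last.type is impossible here.
```

An opaque pointer suffices for the first usage style above, but not for the second, where the caller dereferences named fields.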

To address the shortcomings of the simple default Marshallers, .Net allows one to programmatically implement custom Marshallers for each separate structure that is needed, as part of the interface. However, if we go down this path, we are soon back to where we started: programming rather than specifying, as with the Java Native Interface.

The ineffectiveness actually goes further than this. To effectively interface with any shared C library's API, one must be able to effectively map any C type: function pointers, anyone?

5.5 Existing Work

For arranged marriages between Ruby and C, there is existing work. This work uses Ruby as a starting point, and it is discussed in the following sections.

5.5.1 Ruby Inline

First of all, there exists a solution called, Ruby Inline. Information from the Ruby Inline project page states[38]:

‘‘Ruby Inline is an analog to Perl’s Inline::C. Out of the box, it allows you to embed C/++ external module code in your ruby script directly. By writing simple builder classes, you can teach how to cope with new languages (fortran, perl, whatever).’’

We see several problems with this approach. First, it mixes multiple programming languages within the same software module, or in fact, even within the same file. This is almost never a desirable or good idea from the point of view of a proper coding standard.

Next, people are actually faced with constructing the interface by hand, which is not a good idea from a maintainability point of view.

Then, since C and Ruby are always inter-mixed, even when interfacing to existing shared libraries, this approach puts the burden of portability on the user who develops the multilingual software application.

5.5.2 Ruby Extension API

By itself, Ruby comes with a relatively simple to use extension API interface6. The Ruby Extension interface allows users to write extensions to Ruby, in C, by writing specially prepared C functions and objects. These are then made available to be used from inside Ruby.

The Ruby extension API can also be used to bind existing dynamic libraries to Ruby. Because Ruby itself is written in C, and all objects in Ruby are reflected directly by objects in C, this extension interface offers a very powerful solution: any external C code can thoroughly interact with Ruby's under-the-hood C.

Still, we see shortcomings similar to the previous solution. To start with, users, again, will have to put effort into the actual construction of the interface. Furthermore, because the construction effort has to be made from within C, there is an extra penalty, because C is a less productive programming language than Ruby (see chapter 2).

Furthermore, like the previous solution, there is no clear separation of modules by means of an interface, except when binding an existing external C library to ruby, in which case this C library provides this interface.

Finally, to bind such an external C library to Ruby, one needs to create a Ruby extension (written in C) for each such library. These extensions have to be compiled to native code and deployed accordingly (making sure the compiled code works on the target platform). That is, there is no such thing as interfacing shared C libraries to Ruby purely at run-time using the Ruby extension API, not by default anyway.

5.5.3 Ruby/DL

Coming as a part of Ruby, but mostly abandoned, there is a small Ruby extension called Ruby/DL. Ruby/DL can load shared libraries that have been written in C, at run-time, in a way similar to .Net’s DllImport. It also allows invocation of simple functions which is similar to .Net’s P/Invoke.
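Ruby/DL's dynamic loading style lives on in Ruby's later standard library as Fiddle; the following sketch (assuming a POSIX system where libc's strlen is visible in the current process) shows the idea of run-time loading and binding:

```ruby
require 'fiddle'

# Open the symbols visible in the current process (which include libc on
# POSIX) and bind the strlen function at run-time; no wrapper library is
# written or compiled.
libc = Fiddle.dlopen(nil)
strlen = Fiddle::Function.new(libc['strlen'],
                              [Fiddle::TYPE_VOIDP],  # argument: char*
                              Fiddle::TYPE_SIZE_T)   # result: size_t

strlen.call("hello")   # => 5
```

This is essentially the DllImport/P/Invoke pattern, expressed in Ruby.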

We say that it is mostly abandoned, because we have seen very little evidence of it being used in real world examples, and the development on the extension seems to have stopped. We see the following reasons that may have led to this situation:

• Lack of documentation - The extension is very poorly documented (a small and insufficiently detailed readme). For details, people have to look at the source code (which is, of course, harder to read).

6It is simple in comparison to Java's JNI, which is relatively hard to use.

• Incomplete - The extension itself is incomplete, reducing the chances of it being effective in a production environment.

• Very spartan interface - It has a very spartan interface, making it:

– a. not easy to establish an interface to a shared library's API
– b. not easy to maintain the interface if the shared library's API changes
– c. not efficient in use: it provides neither sufficient encapsulation or seamlessness7, nor support for dynamic boundaries.

• Poor performance - The extension performs poorly (execution speed-wise).

• Lack of use - All previous points lead directly to the situation where, although interesting in idea, the extension is not used much (or even at all).

• Loss of developers’ interest - This is probably the result of all of the above: if nobody uses it and new things come their way, developers lose interest.

Starting Point for Our Research

This is where our research starts. We have taken some of the dynamic loading ideas of .Net’s DllImport and Ruby/DL and created a new solution. We have set up a multi-language development framework that, unlike all existing solutions that we have seen so far, attempts to address all of the goals and requirements set out in the previous chapter.

5.6 Main Thesis

Now that we have completed our language selection in this chapter and discussed its consequences, we can combine this with the most important goals and requirements from the previous chapter to form the main hypothesis of this Master’s Thesis:

Main thesis:

‘‘Is multi-language software development in Ruby and C possible using a run-time language interface under the condition that the interface is easy to establish, maintainable, efficient in use and effective in all required circumstances?’’

In the next chapter we will attempt to address this thesis in a clear and structured manner. The chapter is then followed by an experiments chapter to verify and validate our results.

7It suffers from problems similar to .Net’s custom marshalling.

Chapter 6

DLX

In this chapter we present our solution to the multi-language interoperability problem, in which software components written in two different programming languages, Ruby and C, are to be interfaced. Our solution is called DLX, which stands for Dynamic Loading eXtreme.

We shall describe our solution in a few steps. First we describe its field of application, after which we discuss the design decisions that we have taken. Then we describe, in detail, the architecture of the solution that we created. This is followed by several sections that highlight specific details of our solution.

6.1 Field of Application

DLX is a solution to the multilingual interoperability problem (as opposed to polylingual interoperability problem, see section 2.1.2) and it should be classified under the non-distributed approaches from chapter 3.

Our solution makes it possible for two software components, one written in C, and one written in Ruby, to be interfaced. It can be used to interface two original software components or to interface an original software component to an existing software component1. The existing software component may be, or may have been, developed without taking multi-language interoperability into account.

Generality of the Solution

While our solution currently focuses on interfacing the higher-level, interpreted, language Ruby to the lower-level, compiled, language C (qualifying for the particular name Ruby/DLX), there is no reason to believe that it cannot be equally well applied to languages similar to Ruby (such as Python, Perl, etc.). This would then lead to particular names, such as, Python/DLX, Perl/DLX, etc.

We do restrict ourselves to C as the lower-level, general purpose, programming language, for reasons set out in more detail in section 5.2.

1The existing software component must be a shared or dynamic library.

6.2 Design Decisions

6.2.1 Ruby-centred Development

The first decision that we have made is to put the higher-level language (which is Ruby in our case) at the center of development. Software components written in the lower-level language (C in our case) are then seen as an add-on.

This decision is reinforced by our 80/20 conjecture from section 2.2.6. This conjecture leads to a software development approach where the use of C is restricted to only the computationally intensive portions. All other (supporting) source code is better written in the higher-level and more productive programming language, which, in our case, happens to be Ruby.

More concretely, this decision leads to a typical modus operandi where a Ruby script is executed, which at some point of execution loads one or more shared libraries (written in C) into memory.

6.2.2 Run-time

The second decision is to do it all at run-time. We already identified one possible argument for choosing run-time over compile-time in section 5.3.2. In the end, each of the following three arguments has led us to choose run-time over compile-time:

1. Ruby itself is an interpreter-based language, making it a very run-time oriented language. Letting DLX work at run-time connects with this in a very natural way.

2. Choosing run-time, like .Net’s DllImport and P/Invoke and unlike Java’s JNI, voids the need for an extra shared C library to establish the interface. This way, once Ruby/DLX is available on a target system, the functionality of any shared C library can be disclosed to Ruby, and to the end users, without ever having to write or compile a single line of code.

3. As we identified earlier, choosing run-time is one way to implement the separation of concerns regarding portability: once DLX is confirmed to be working on a target platform, it should work for establishing an interface with any shared C library on that platform.

6.2.3 Specification not Programming

We liked the way external shared libraries are interfaced using .Net’s DllImport and P/Invoke. Unlike Java’s JNI and Ruby’s extension API, in .Net the multi-language interface is specified rather than programmed. Therefore we also choose specification over programming.

However, the specification that is allowed in .Net is woefully inadequate to establish an interface with just any shared C library. For this it is too limited; important features, like proper support for Structure Types, Union Types, Array Types, and Function Pointer Types (e.g. callbacks), are lacking.

In addition to this, a multi-language interface between .Net and an existing shared C library often requires explicit offsets to be given as part of the specification. This does not allow for the separation of concerns regarding portability. The Ruby/DL extension that we introduced in section 5.5.3 contains an algorithm –which we improved upon– that can determine these offsets automatically for any type.
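The core idea of such an offset algorithm can be sketched in a few lines of Ruby (a toy version that assumes natural alignment, i.e. each field is aligned to its own size, with hypothetical hard-coded type sizes; a real implementation must query the platform's ABI instead):

```ruby
# Hypothetical type sizes for a typical platform (illustration only).
SIZES = { "char" => 1, "short" => 2, "int" => 4, "double" => 8 }

# Compute the byte offset of each field, rounding each offset up to the
# field's own alignment before placing it.
def field_offsets(types)
  offset = 0
  types.map do |t|
    size   = SIZES.fetch(t)
    offset = ((offset + size - 1) / size) * size  # round up to alignment
    at     = offset
    offset += size
    at
  end
end

field_offsets(%w[char int char double])  # => [0, 4, 8, 16]
```

Computing the offsets this way, instead of writing them down by hand, is what keeps the specification portable.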

API Specification Syntax Similar to C

We have chosen an API specification syntax that is very similar to the original C API specification syntax. Such a specification makes it easy to specify a new interface, and it helps to minimize the three maintainability risks that we identified in section 4.3.2:

1. Synchronization errors - The similar specification will make it as simple as possible to reflect changes in the shared library’s C API back to the DLX specification. (There are also other factors that can cause synchronization errors. These shall be addressed separately later in this chapter in section 6.6.2.2.)

2. Heterogeneous mapping - The similar specification prevents the heterogeneous mapping that would have been possible if the interface were to be programmed by several individuals. Programming gives freedom; specification limits such freedom.

3. Black boxing - In contrast to having an ad-hoc programmed solution for every shared library that is interfaced to Ruby, DLX is a separate solution, with its own maintainers. The simple specification helps, because it will always be clear where the boundary between Ruby and C is.

However, as we stated previously, using a specification rather than programming limits freedom. We must be cautious that our specification, unlike .Net’s, which appeared to be too limited, still allows us to effectively establish an interface to any shared C library.

C API Specification Syntax, But Not Quite

Although we have chosen an API specification syntax that greatly resembles the original API specification in C, it is not exactly the same syntax.

First of all, our syntax uses a blend of C, inside the original Ruby syntax. Ruby has proven to be a very flexible language and creative use of the language has allowed for doing many things that were not anticipated up front. By choosing to use this blend of C inside original Ruby syntax, we hope that some of Ruby’s flexibility may become useful to DLX, some day, in a way that we cannot see at this point.

Secondly, our syntax allows for some extra specification to be added, which can be useful for a number of things:

C:

#ifndef __EXAMPLE_LIB_H__
#define __EXAMPLE_LIB_H__

typedef struct Simple
{
    char c;
    short s;
    int i;
    long l;
    char* string;
    float f;
    double d;
} Simple;

struct Bitfield
{
    int bf1 : 4;
    int bf2 : 3;
};
typedef struct Bitfield Bitfield;

typedef struct NotSoSimple
{
    Bitfield* bf;
    Simple* simple1;
    Simple** list;
    struct Simple simple2;
    char* (*simple_callback)(char* s);
    char* (*notsosimple_callback)(Simple* s);
    int (*notsosimple_callback2)(NotSoSimple* nss);
} NotSoSimple;

int simpleFunction( double d );
NotSoSimple* loadNSS( char* filename );
void printSimple( Simple* simple );

#endif /* __EXAMPLE_LIB_H__ */

DLX:

#! /usr/bin/env ruby
require 'dlx/DLXImport'

module ExampleLib
  extend DLXImport

  dlload 'example' # load the dynamic c library

  class Simple < struct "Simple",
  [
    "char",   :c,
    "short",  :s,
    "int",    :i,
    "long",   :l,
    "char*",  :string,
    "float",  :f,
    "double", :d,
  ]
  end
  typealias( "Simple", "struct Simple" );

  class Bitfield < struct "Bitfield",
  [
    "int : 4", :bf1,
    "int : 3", :bf2,
  ]
  end
  typealias( "Bitfield", "struct Bitfield" );

  class NotSoSimple < struct "NotSoSimple",
  [
    "Bitfield*",     :bf,
    "Simple*",       :simple1,
    "Simple**",      :list,
    "struct Simple", :simple2,
    "char* (*simple_callback)(char* s)",              :simple_callback,
    "char* (*notsosimple_callback)(Simple* s)",       :notsosimple_callback,
    "int (*notsosimple_callback2)(NotSoSimple* nss)", :notsosimple_callback2,
  ]
  end
  typealias( "NotSoSimple", "struct NotSoSimple" );

  extern "int simpleFunction( double d )"
  extern "NotSoSimple* loadNSS( char* filename )"
  extern "void printSimple( Simple* simple )"
end

Figure 6.1: A side-by-side comparison of C versus DLX.

• Optimization - For example, we have used extra specification to introduce a form of late binding to improve start-up performance.

• Increasing seamlessness - For example, in Ruby the size of an array is part of the array object; in C it is not. Introducing extra specification will allow us, in some cases, to improve the encapsulation of C arrays when used from Ruby. (For a more detailed explanation see 6.4.7.)

• Memory management - We experimented with connecting Ruby object finalization to C object lifecycle and memory management. In this experiment we use extra type specification that is not part of the C API specification (see section 7.6).

Example DLX Specification

Putting all things together, one ends up with something that looks like C inside Ruby. However, unlike Ruby Inline from section 5.5.1, it is used only for specification purposes; no program (source) code of multiple languages is being mixed.

To get a preview of what it looks like, we give an example side-by-side C API specification (e.g. a header file) versus a DLX specification in figure 6.1.

6.3 The DLX Architecture

In this section we describe how we put all of the previous decisions together and what is needed to make everything come together: The DLX Architecture.

First we give an architectural overview of DLX and its components. Then we shortly describe how they work together in typical use. The second half of this section is dedicated to describing each of the components in more detail.

6.3.1 Architectural Overview

In figure 6.2 we give an overview of DLX and its components. In the center, DLX is shown with all of its components. To the right we schematically depict an example shared library that is to be loaded. To the left we give the Ruby/DLX code that is used to establish the interface.

To use an external shared C library from Ruby using DLX, first a Ruby module is created that acts as a namespace entity. This namespace entity is where all symbols and related type information contained in the shared library are bound. To make this possible, the module must be declared to extend DLXImport. This sets up the module to be used as the binding entity described above. It also makes certain functions (methods) available that are used to establish the interface.

Once the module has been properly created, the Library Locator/Loader is invoked using the dlload method to locate and load the library.

Figure 6.2: The DLX Architecture

After the library has been loaded, one can start to disclose its functionality to Ruby using the extern method. This invokes the DLX Symbol Binder which locates the functionality inside the loaded library and makes it available to be used from Ruby.

The functions that have been bound by the Symbol Binder are now ready to be invoked. Invoked functions are handled by the DLX Function Caller.

However, at this point, one can only bind fairly simple functions or global variables that only use built-in types such as Integer Types, Floating Point Types, or Arrays of these types.

In order to be able to bind functions or global variables that make use of more complex types, such as Structure Types, Union Types, or Callbacks, these data structures have to be introduced into the DLX Type Database.

Some shared libraries contain functions that invoke callback functions as part of their algorithm. DLX comes equipped with a Callback Caller component that allows such functions to perform callbacks to pure Ruby methods, without requiring any additional C code; everything is performed at run-time.

We conclude our overview of the DLX Architecture by noting that all marshalling is performed behind the scenes by the two type mapping components: the C-to-Ruby Mapper and the Ruby-to-C Mapper. (The actual type mapping is discussed in detail in section 6.4.)

6.3.2 Library Locator/Loader

The Library Locator/Loader –as the name implies– consists of a Locator part and a Loader part.

The Locator is used to find the correct library on the target system. In doing so, it faces several challenges caused by Operating System differences:

• First of all, it has to cope with naming differences; a shared library in Windows™ may be called SDL.dll, while that same library on most POSIX compliant operating systems is called libSDL.so and on –yet another popular operating system– Mac OS X, the library is known under the file name libSDL.dylib.

• Secondly, the Locator has to deal with location differences: libraries on Windows™ are typically located in the %SYSTEM% and %PATH% paths, or inside the current directory; libraries on GNU/Linux are usually found by scanning the current directory, $LD_LIBRARY_PATH, or one of the paths configured in /etc/ld.so.conf.

Once the correct library has been found on the target system, the Loader is used to dynamically load the library into memory. The Loader also faces problems, due to differences in the functions that the Operating System provides for this purpose:

• On GNU/Linux, the dlopen facility is used for this purpose.

• Windows has a similar facility called LoadLibrary.

The DLX end user, however, is shielded from all this Operating System trickery by the public interface of the Locator/Loader pair in the form of dlload as we saw previously. The DLX end user only provides the Operating System independent name, regardless of the actual Operating System on which Ruby/DLX is currently running. (In our example, the Operating System independent name would have been ‘‘SDL’’.)
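As an illustration of the naming rules above, here is a small pure-Ruby sketch (ours, not part of DLX; the function name platform_library_name is hypothetical) of what the Locator must do to turn an Operating System independent name such as ‘‘SDL’’ into a platform file name:

```ruby
# Illustrative sketch of the Locator's naming step: map an
# OS-independent base name to the platform's library file name.
def platform_library_name(base)
  case RUBY_PLATFORM
  when /mswin|mingw|cygwin/ then "#{base}.dll"       # Windows
  when /darwin/             then "lib#{base}.dylib"  # Mac OS X
  else                           "lib#{base}.so"     # POSIX (e.g. GNU/Linux)
  end
end
```

The real Locator must, in addition, search the per-platform library paths described above.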

6.3.3 Symbol Binder

A shared library can hold two types of objects:

1. global variables; and

2. functions (in compiled form).

After the library has been loaded into memory, these objects become accessible at certain locations within this memory. To find out where each function or variable is located in memory, every shared library comes with a symbol table. It is this table that links the original names (as they appear in the source code) to corresponding addresses in memory.

To read out the symbol table, each of the dynamic library loading facilities from the previous section comes with its own counterpart:

• In Windows™ the symbol table counterpart of LoadLibrary is called GetProcAddress2.

• On GNU/Linux the symbol table counterpart of dlopen is called dlsym.

Each of the above functions returns the address of (read: a pointer to) the C object that they refer to.

After performing basic well-formedness tests, the DLX Symbol Binder takes two different actions based on the type of the object that the symbol refers to.

1. If the object is a global variable, then the object is wrapped inside a DLXGlobal object, which takes care of referencing and (un)wrapping Ruby and C objects using one of the Mapper components (denoted in figure 6.2).

2. If the object is a function, then more complicated actions are taken. These are described below.

To bind a C function to Ruby, the Symbol Binder performs a type sanity check using the Type Database, to ensure that no types are used that have not yet been inserted into the database. This means that, if no complex data structures have been declared, only simple types such as Integer Types, Floating-point Types or Arrays of these types can be used as part of the function’s prototype.

After the sanity check has been performed, the processed function prototype is input into the Type Database. Then, a pure Ruby function is created with the symbol’s name, that references the processed prototype, so that it may be used later by the DLX Function Caller.

6.3.3.1 Late Binding

Every binding takes up space in memory and incurs a start-up penalty. Because shared libraries are typically much larger and provide many more functions than the average program will ever use, the DLX Symbol Binder has the ability to delay the binding (see figure 6.3).

extern "int simpleFunction( double d )", delay

Figure 6.3: Delaying a binding to a function in DLX.

Whenever a function’s binding is delayed, only a simple test is performed to verify the well-formedness of the function prototype. No proxy functions are created or added to the Type Database, and no type checks are performed. The prototype of a function whose binding is delayed is collected and stored as-is, until it is time to be processed.

The binding is delayed until the first time the function is actually used. This is made possible by a Ruby facility that calls one particular method, called method_missing, whenever a method could not be called because it did not exist at the time it was called.

2In Windows CE™ it is called GetProcAddressA.

Consider, for example, the module ExampleLib from figure 6.2. Whenever a method is called on ExampleLib that does not yet exist, it is caught by its method_missing method. By extending DLXImport (in one of the steps of the overview), the default method (which raises an exception) has been overridden by a function that forwards the failed call to the Symbol Binder. The Symbol Binder then checks whether an external function by that name has previously been declared with a delayed binding; if so, it performs the binding in the same way as it would have done at program start. If there is no such function by that name, there must be some other error, in which case the Symbol Binder raises an exception and program execution is terminated by Ruby3.
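The late-binding mechanism can be illustrated with a small self-contained Ruby sketch (ours, greatly simplified; LazyBinder and extern_delayed are hypothetical stand-ins for DLXImport and extern ..., delay). Delayed declarations are stored as raw prototype strings, and method_missing performs the binding on first use:

```ruby
# Sketch of method_missing-based late binding: prototypes declared with
# extern_delayed are stored unprocessed and only bound on first call.
module LazyBinder
  def self.extended(mod)
    mod.instance_variable_set(:@delayed, {})
  end

  # Stand-in for DLX's `extern ..., delay`: store the prototype as-is.
  def extern_delayed(prototype)
    name = prototype[/\w+(?=\s*\()/] # crude name extraction for the sketch
    @delayed[name.to_sym] = prototype
  end

  # The first call to an unbound name lands here; we "bind" on the spot.
  def method_missing(name, *args)
    proto = @delayed.delete(name)
    raise NoMethodError, "no delayed binding for #{name}" unless proto
    # Real DLX would create a proxy into C; here we define a Ruby method
    # that merely reports what it would have bound.
    define_singleton_method(name) { |*a| "bound #{proto} with #{a.inspect}" }
    send(name, *args)
  end
end

module ExampleLib
  extend LazyBinder
  extern_delayed "int simpleFunction( double d )"
end
```

After the first invocation, the generated method is in place and subsequent calls bypass method_missing entirely, mirroring the one-time binding cost described above.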

6.3.4 Function Caller

Once a shared library’s function has been bound to Ruby, it is ready to be invoked, just as one would invoke an ordinary (pure) Ruby method. The delegation of the function’s invocation to the actual C function is the primary task of the DLX Function Caller.

The DLX Function Caller component consists of two subcomponents:

1. The Function Argument Stack Filler; and

2. The Function Pool.

These two components are functionally very related and we will discuss their details in the following two subsections.

6.3.4.1 Function Argument Stack Filler

Ruby/DLX consists of only a single binary4; there is never any additional code required to bind a shared C library to Ruby via DLX. In order to achieve this, DLX must be equipped to call any function regardless of its prototype. However, with all those shared C libraries, some written and some still unwritten, it is not hard to imagine that there is a nearly infinite number of prototype combinations possible for any number of arguments.

Of course it is infeasible for DLX to incorporate all of these possible prototypes into the binary. To make do with far fewer functions in the final binary, DLX reconstructs its own argument stack, a pushdown stack, that is created in local address space. Even though it is not part of the language specification itself, Harbison III and Steele Jr. [44] state that it is common practice for implementors to do this.

All arguments that are passed to the DLX Function Caller are converted into their C counterparts using the Ruby-to-C Mapper and are then pushed onto the stack, taking into account any mandatory language-specified alignment and/or widening or narrowing casts. It is the sole task of the Function Argument Stack Filler to ensure that this happens per the C language specification or as close as possible.
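The alignment part of this task can be illustrated by a small Ruby sketch (ours; the sizes and alignments shown are assumptions, typical of a 64-bit ABI): each argument is placed at the next offset that is a multiple of its required alignment.

```ruby
# Round an offset up to the next multiple of a power-of-two alignment.
def align(offset, alignment)
  (offset + alignment - 1) & ~(alignment - 1)
end

# Assumed example sizes/alignments (illustrative only).
args = [
  { name: :c, size: 1, align: 1 }, # char
  { name: :i, size: 4, align: 4 }, # int
  { name: :d, size: 8, align: 8 }, # double
]

offset = 0
layout = args.map do |a|
  at = align(offset, a[:align])    # pad up to the required alignment
  offset = at + a[:size]
  [a[:name], at]
end
# layout => [[:c, 0], [:i, 4], [:d, 8]]
```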

3Unless the exception is caught, of course.
4A Ruby extension that is called dlx.so.

6.3.4.2 Function Pool

Because it is not reliable –or portable, for that matter– to use the above pushdown stack directly to pass the arguments to an arbitrary function coming from the external shared library, we use a trick that we believe to be Clean C (or as close as possible): we use a Function Pool of proxy functions, each taking a successively larger argument stack size. Every proxy function then performs an actual hard-coded invocation of the shared library’s function, which is passed as a pointer, by manually passing the individual long int slots of our reconstructed argument stack as parameters to the function. A source code fragment is given in figure 6.4.

long long_caller0( generic_long_function* function, long args[MAX_ARGS] )
{
    return( (*function)( ) );
}
long long_caller1( generic_long_function* function, long args[MAX_ARGS] )
{
    return( (*function)( args[0] ) );
}
long long_caller2( generic_long_function* function, long args[MAX_ARGS] )
{
    return( (*function)( args[0], args[1] ) );
}
long long_caller3( generic_long_function* function, long args[MAX_ARGS] )
{
    return( (*function)( args[0], args[1], args[2] ) );
}

Figure 6.4: Excerpt from the Function Pool: Actual invocation using proxy functions.

Return Types

There is one thing that is not addressed by the pushdown stack approach, and that is the handling of the return type. In short: it is not possible to invoke a dynamic function using a proxy function whose return type is not compatible5 with the return type of the dynamic function itself. Therefore the pool has to be extended with proxy functions for each differently sized return type that is to be supported. In the current release of Ruby/DLX, return types of sizeof(long) and sizeof(double) are supported6.

Function Pool size

The current architecture poses no fundamental limitation on the Function Pool size; the Function Pool is, however, created at compile-time and is hard-coded, thus limiting the maximum argument stack size of the functions that can be invoked by the DLX Function Caller. If functions are to be called with exceptionally large parameter lists in their prototype, then DLX will not be able to call such functions without being recompiled with a larger Function Pool size. Having said that, the current release of DLX can handle function prototypes that have an argument stack size of up to 80 bytes (or 20 long ints).

5For type compatibility within C, see section 5.11 of [44].
6These are, respectively, four and eight bytes on a 32-bit Intel architecture.

Although it is an unorthodox approach, the above strategy has not yet produced any non-working DLX binary, even though DLX has already been tested on a variety of machines, operating systems and architectures (even more than we have formally tested and verified to be working in section 7.2.1).

6.3.4.3 The invocation process

Now that the underlying details of the Function Caller’s invocation are known, we can summarize the invocation process itself in the following six steps:

1. Retrieve the function prototype information from the Type Database7.

2. Using the Ruby-to-C Mapper, map all parameter types to their equivalents in C.

3. Invoke the Function Argument Stack Filler.

4. Select the appropriate proxy function from the Function Pool.

5. Invoke the proxy function and capture the return value.

6. Invoke the C-to-Ruby Mapper to convert the value back to a Ruby value.
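The six steps can be sketched, in pure Ruby, with a toy pipeline (ours, not DLX source; the prototype table, the mappers and the proxy are simulated stand-ins):

```ruby
# Toy end-to-end sketch of the six-step invocation process.
PROTOTYPES = { simpleFunction: { params: [:double], return: :int } }

def ruby_to_c(value, ctype)   # stands in for the Ruby-to-C Mapper
  ctype == :double ? Float(value) : Integer(value)
end

def c_to_ruby(value, ctype)   # stands in for the C-to-Ruby Mapper
  ctype == :int ? value.to_i : value.to_f
end

def invoke(name, *args)
  proto  = PROTOTYPES.fetch(name)                # 1. look up the prototype
  cargs  = args.zip(proto[:params])
               .map { |v, t| ruby_to_c(v, t) }   # 2. map parameter types
  stack  = cargs                                 # 3. fill the argument stack
  proxy  = ->(a) { a.first.floor }               # 4. select a proxy (fake C fn)
  result = proxy.call(stack)                     # 5. invoke, capture the result
  c_to_ruby(result, proto[:return])              # 6. map the result back
end
```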

6.3.5 Type Database

As stated previously in the overview, to bind functions that use more complex data structures, such as Structure Types, Union Types or Callback Types, these data structures first have to be introduced into the DLX Type Database.

Figure 6.5: The DLX Type Database and its major components

The DLX Type Database can hold information about the following six C types:

1. Structure Types,

2. Union Types,

3. Array Types,

7To speed this up, a local cache in C is used.

4. Function Types,

5. Callback Types, and

6. Typedef Names in the form of Type Aliases.

Although some components cache direct references to relevant type information in order to speed things up a little, the Type Database is the central location for keeping track of all type information. It is the responsibility of the Type Database to ensure that all the type information it contains is consistent (i.e. it does not contain incomplete types). As such, it plays a large role in the error prevention measure that we call Type Closures (see section 6.6.3).

6.3.6 Callback Caller

6.3.6.1 Callbacks

Some shared libraries come with one or more callback calling functions. A callback calling function α is a function that expects to receive a pointer to a C function β that is supplied by a calling function γ. The supplied C function β is then invoked –or called back, if you will– mid-function, as part of function α’s control flow, temporarily giving control to the supplied function β. See figure 6.6 for a schematic representation of the control flow.

Figure 6.6: Callbacks in C.

6.3.6.2 Communicating Callbacks

There are two ways for the calling function γ to supply the required function pointer β. It can either supply the value directly, as one of α’s input parameters, or it can supply it indirectly, where the callback function β is supplied in the form of a field of a Structure or Union Type.

In DLX, callback constructions are handled by the Callback Caller. DLX supports callback calling functions that require callback functions to be supplied either indirectly or directly (see also section 6.4.10).

6.3.6.3 Multi-language Callbacks in DLX

Unfortunately, callback constructions that involve more than one language are more complex than was previously shown in figure 6.6. The DLX Callback Caller distinguishes four different control flows that can occur using callbacks in a Ruby/C combination as can be seen in figure 6.7.

Figure 6.7: The four possible callback control flows.

The four flows that are distinguished are:

1. Inside a C function, a callback is invoked that still points to a function that also resides on the C side.

2. From inside a Ruby method, a callback is invoked that points to a function that exists in C.

3. From inside a C function, a callback is invoked that points to a method in Ruby space.

4. From inside a Ruby method, a callback is invoked that points to a method that also resides in Ruby space.

6.3.6.4 Callback Pool

In DLX, one cannot simply supply a Ruby method as the callback function for a callback calling C function; Ruby methods are interpreted source code, while C functions are executed as compiled code, and a callback function can only be supplied as a pointer to compiled C code. So, just like the Function Pool, we create a Callback Pool. The Callback Pool consists of a collection of hard-coded and compiled-in proxy functions that can delegate the control flow to interpreted Ruby methods, after which the result is returned to the callback calling function.

The actual invocation process is very similar to the Function Caller invocation process8, which was discussed earlier; we refer to that section for the details.

8One small difference may be that some proxy functions are able to call Ruby methods instead of C functions.

Callback Pool Size

Because the Callback Pool consists of precompiled proxy functions, and since there is no easy or portable way of adding new proxy functions at runtime, there is a limit to the number of callback functions that can be in use at one particular moment in time. Once this limit has been reached, an exception will be raised by DLX. There is no theoretical limit to the number of callbacks that DLX can support, however, if more callbacks are required for a particular application, recompilation will be necessary. In the current release DLX supports up to one hundred callbacks that may be in use at one particular point in time.
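The pool mechanics and the exhaustion behaviour can be sketched in pure Ruby (ours, illustrative only; in the real implementation the slots are precompiled C proxy functions, and the slot index stands in for a C function pointer):

```ruby
# Sketch of a fixed-size Callback Pool: a limited set of proxy "slots",
# each of which can be claimed to forward a call to a Ruby block.
class CallbackPool
  def initialize(size)
    @slots = Array.new(size)   # fixed number of "precompiled" proxies
  end

  # Claim a free slot for a Ruby block; the returned index plays the
  # role of the C function pointer handed to the shared library.
  def register(&block)
    index = @slots.index(nil)
    raise "callback pool exhausted; recompile with a larger pool" if index.nil?
    @slots[index] = block
    index
  end

  def release(index)
    @slots[index] = nil
  end

  # What a proxy function does: delegate to the registered Ruby block.
  def call(index, *args)
    @slots[index].call(*args)
  end
end

pool = CallbackPool.new(100)   # the current DLX release allows 100
id = pool.register { |x| x * 2 }
```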

6.4 Type Mapping: From C to Ruby and Back

In this section we describe how we map C types to Ruby (and vice versa) using the DLX C-to-Ruby and Ruby-to-C object mappers. We will also explain why this proves to be such a challenging task, and how we solve, or intend to solve, many of the problems we have encountered. For several harder mappings, a small discussion section is added to discuss the current mapping, its shortcomings and possible future improvements.

6.4.1 C Types

According to C: A Reference Manual (Harbison III and Steele Jr. [44]), the C programming language specifies the following nine types (in no particular order):

1. Integer Types

2. Floating-point Types

3. Pointer Types

4. Array Types

5. Enumerated Types

6. Structure Types

7. Union Types

8. Function Types

9. The Void Type

Furthermore, in C, variables must be declared before they can be used. The declaration is used to assign one of the above types to such a variable (with the exception of the Function Type; a variable can only be declared as a pointer to a function). The type system is used to detect relatively simple errors at compile time. You cannot assign a value to a variable if their types are not compatible. Sometimes a type cast is needed to enforce type compatibility; however, this is not always possible.

Special names can be given to specific instances of (combinations of) the above types, introducing new types, which can be seen as aliases for the actual types. This is done using typedef, which invokes the type definition facility.

Values occupy memory space. The amount of memory that is used can differ per type. To determine the amount of memory space that a value of a specific type will need, the sizeof() operator can be invoked.
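Ruby offers a small analogue to sizeof() through Array#pack, whose native-size directives ("s!", "i!", "l!") yield strings whose byte length equals the platform’s sizeof(short), sizeof(int) and sizeof(long); a short illustration (ours):

```ruby
# Native type sizes, observed from pure Ruby via Array#pack.
sizeof_short  = [0].pack("s!").bytesize   # sizeof(short)
sizeof_int    = [0].pack("i!").bytesize   # sizeof(int)
sizeof_long   = [0].pack("l!").bytesize   # sizeof(long)
sizeof_double = [0.0].pack("d").bytesize  # IEEE double: 8 bytes
```

Note that the observed sizes respect C’s nondecreasing short/int/long ordering discussed in section 6.4.3.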

For more detailed information on C and its types, we refer to the above book. However, we will go through each of the C types in the following sections as we describe how we mapped the C types to Ruby and vice versa.

6.4.2 Ruby Types

There are two ways in which you can look at the type system used by Ruby. The first way is to look at it from the language side. The other way is to look at it from the way it is implemented. Ruby itself is implemented in C and to understand the way we have mapped C types to Ruby, both must be understood in a little more detail.

Ruby Types 1: As Seen From the Language Itself

In Ruby all values have a type, variables do not. Furthermore, all values are considered to be objects and all variables are references to objects. Any value or object can be assigned to any variable. All objects are descendants of the Ruby class Object9.

Ruby’s notion of a constructor is embodied by a special function, called initialize. When a new object in Ruby is instantiated from a class (using the new method), then initialize is automatically invoked as part of the new object’s initialization.

To Ruby, a class is merely a way of encapsulating a collection of attributes and operations (methods) on these attributes. In fact, to Ruby, public attributes and methods are the same thing. However, to Ruby, a class is not the same as a type.

Ruby doesn’t use nominal typing as many other imperative programming languages do. Ruby uses, as its followers call it, duck typing: ‘‘If it walks like a duck, acts like a duck and sounds like a duck, then it probably is a duck’’.

Duck typing is a form of structural typing, in which the type of an object is not specified by the name of the object’s class, but it is specified by the methods that are defined for a certain object, and the semantics that go with these methods: Classes and their names are just a convenient way of addressing certain instances, or implementations, of these collections of attributes, methods, and semantics.

This means that two objects, one object of class A and the other of class B, can be considered to be of the same type if both classes provide exactly the same methods and semantics. However, if only the methods are the same, while the semantics of (some of) these methods are different, then the types are not the same.

9Object, in turn, is a descendant of Kernel, but this detail bears little importance.

In the light of this, it is unlikely that Bankrobber and Artist are of the same type, since Bankrobber.draw(gun) and Artist.draw(painting) are unlikely to have the same semantics (if they did, it would probably have ended disastrously for the Artist).
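A concrete illustration (ours) of duck typing: both classes below answer to draw, so either object can be passed wherever something that draws is expected, even though their semantics differ wildly:

```ruby
class Artist
  def draw(subject)
    "a painting of #{subject}"
  end
end

class Bankrobber
  def draw(weapon)
    "#{weapon} drawn; hands up!"
  end
end

# No type annotations: anything responding to `draw` is acceptable.
def make_draw(duck, thing)
  duck.draw(thing)
end
```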

Ruby Types 2: Implementation Detail

Another way of looking at Ruby types is by the way they are implemented. As already stated in chapter 5, Ruby itself is implemented in C. All Ruby objects, as seen from C, are of type VALUE, which is typedef defined as the Integer Type unsigned long.

In the implementation of Ruby there are two different types of values, leading to two distinct ways of interpreting VALUEs. Usually a VALUE is to be interpreted –and subsequently type cast– as a pointer to a C structure10. The C structure is the actual (implementation of the) object.

However, there is an exception to this rule: Fixnum, Symbol, true, false and nil are stored directly as part of the VALUE (i.e. encoded within the unsigned long). This is achieved by some pointer trickery and by the assumption that all pointers point to a memory address that is aligned on four or eight bytes (for 32-bit and 64-bit architectures, respectively). This assumption guarantees that the low 2 bits of a pointer will always be zero, making them available to be used for distinguishing immediate values from non-immediate values.

Please note that, to Ruby itself, the distinction between immediate and non-immediate values is transparent. Any notion of pointers is always kept hidden from the user.

6.4.3 Integer Types and Floating-point types

We start our description of the type mapping between C and Ruby with two relatively simple types: Integer Types and Floating-point Types.

Integer Types

In C, there are three sizes of signed integer types, denoted by the type specifiers short, int and long, in nondecreasing order of size. C itself does not specify the range of integers that the above types represent, only that a short shall be representing a smaller maximum integer value than an int shall represent and so on. For each signed integer type there exists a corresponding unsigned integer type.

To Ruby there exists only one (signed) Integer Type, which may be implemented in different ways. On the implementation level, there exists the Fixnum, which is an immediate value, holding at most sizeof(long)-1 bits11. There also exists an Integer Type implementation called Bignum, a non-immediate type, holding an arbitrary number of bits (which is only limited by the currently available memory, or limitations implied by the operating system on which the interpreter is running).

10This requires sizeof(void*) == sizeof(long) for Ruby to be compiled, which is also an important assumption in the current implementation of DLX.

With regard to Integer Types, Ruby/DLX performs a simple conversion from C to Ruby. The conversion is facilitated by Ruby’s C extension interface. If values are small enough (in bit size), a conversion is made to a Fixnum; if the values are too large, a conversion is made to a Bignum. Because Ruby’s Fixnum is unable to contain unsigned values larger than sizeof(long)-2 bits, such values are converted to Bignums whenever they cross this boundary.

The character type char is also an integral type. It is always converted into a Fixnum.
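This conversion behaviour can be observed from plain Ruby (note that, in Ruby 2.4 and later, Fixnum and Bignum were folded into the single class Integer, but the same two underlying representations still operate):

```ruby
small = 2**10          # comfortably fits in a machine word
large = 2**100         # far beyond sizeof(long)*8 bits

# Both look like plain integers to the programmer...
kinds = [small.class, large.class]

# ...and the "big" representation really holds all the bits,
# with arithmetic crossing the boundary transparently.
bits = large.bit_length
```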

Floating-point Types

There are two sizes for representing floating-point numbers in C: float and double, for single and double precision. A third permissible floating-point type specifier is long double, but it is currently not implemented as part of Ruby/DLX. With regards to the range of values that each type specifier can represent: The same assumption that applied to the integer type specifiers also applies to the floating-point type specifiers.

Ruby has only one Floating-point Type, Float, which, unlike the Fixnum, is not an immediate value, making its use somewhat costly. Even though the Float is not an immediate type, from the point of view of a Ruby program it is nevertheless immutable; you cannot alter a Ruby floating-point number directly, you can only create a new Ruby floating-point number (which is initialized with a different floating-point number in C).

Floating-point types in C are always converted to Ruby’s Float type using Ruby’s C extension interface. When translating a Ruby Float back to C, the type information from the DLX specification is used to automatically determine the original type.

6.4.4 Pointer Types

In C, it is possible to pass objects by their actual value, or by passing a reference to that object (i.e. the memory location of that object). These two forms are called call by value and call by reference, respectively. It is common for simple types, such as Integer Types and Floating-point types, to be passed by value, while other, larger12, objects are passed by reference.

As we have discussed previously in the beginning of this section, in Ruby, pointers are only visible at implementation level. Their use inside a Ruby program is kept transparent to the user.

11The least significant bit is reserved as the distinguishing factor between immediate and non-immediate values.
12This property makes them expensive to pass around by value.

It is possible in C to create a pointer to pretty much any object, of any type, by simply taking the memory address of that particular object. To understand how this is mapped to DLX, the easiest way is to distinguish different pointer types and discuss how each of these is mapped. In DLX, we distinguish the following five types of pointers:

1. Pointers to objects of Integer Types and Floating-point Types

2. Pointers to objects of Structure Types and Union Types

3. Pointers to functions

4. Pointers to arrays of objects

5. A generic pointer type

Of these, only the first and last type will be described in this section. Internally, DLX always handles objects of types 2, 3, and 4 as pointers. Their details are discussed later, when we describe the mapping of these particular types.

Pointers to Objects of Integer Types and Floating-point Types

A pointer to an object of either an Integer Type or a Floating-point Type is mapped, in DLX, to an array with a single element. This happens either automatically (for example, for the return type of a function, or when accessing a structure’s field), or explicitly.

For instance, if a pointer to an object of such a type must be passed to a function, then an array of size one is created in Ruby and passed to the C function. This allows access to the value from inside Ruby and lets DLX pass the value as a pointer.

We have not found a way to make this usage more seamless than simply explaining the API semantics of such a function in Ruby terms as: This function expects to be passed an array of type integer (or floating-point).
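The idiom can be illustrated in pure Ruby (ours; read_status is a hypothetical stand-in for a C function with an int* out-parameter):

```ruby
# The single-element array acts as a mutable cell, so the callee's
# write is visible to the caller, just as it would be through an
# int* out-parameter in C.
def read_status(out_code)   # stands in for: void read_status(int* code)
  out_code[0] = 42          # the "C function" writes through the pointer
end

cell = [0]                  # Ruby-side stand-in for &code
read_status(cell)
```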

The Generic Pointer Type

In DLX there is also a generic pointer type. A pointer of this type can represent any pointer in C. It is implemented in the DLXCPtr class, the class that acts as the superclass of the particular classes for each of the remaining object types: Structure Types and Union Types, Function Pointer Types, and Array Types.

Any C object can be wrapped inside an object of class DLXCPtr by means of the wrap class method, which is created especially for this purpose. Subclasses have their own specific implementations of this method. Furthermore, although it is normally kept transparent to users, for the sake of not compromising the effectiveness of our solution in any circumstance, the memory address of any DLXCPtr, or derivative, can be revealed by means of the addr and addr= methods13.

13The last form allows one to mutate an existing DLXCPtr object so that it can be made to point to a different address.

Object comparison versus object equality

In Ruby, there is a difference between object comparison and object equality. Object comparison (implemented by the ‘‘==’’ method) is used to test if two objects represent the same value. The result of such a test depends on the semantics of the class of objects being compared. For instance:

>> 1 == 1
=> true
>> [] == []
=> true
>> [1,2] == [2,1].reverse
=> true
>> 1 == 2
=> false
>> [1,2] == [2,1]
=> false

Object equality (implemented by the ‘‘equal?’’ method), however, is used to test if two objects are literally the same object. (One might say it is Ruby’s equivalent of a pointer equality test, but it also applies to immediate values). For example:

>> 1.equal?(2-1)
=> true
>> [].equal?( [] )
=> false
>> [1,2].equal?( [2,1].reverse )
=> false

Here, we can actually see from the first result that integers are really immediate types, since the literal numeric value 1 is really the same object as the result of the expression (2 − 1). We also see the difference between object comparison and object equality for the following tests, which now evaluate to false, while they previously evaluated to true in the comparison test.

The distinction between object equality and object comparison is very important, to Ruby and also to DLX, because –as we shall see later– more than one Ruby object can wrap the same object in C14. Thus, to test for object equality, it is not sufficient to simply test for Ruby object equality. We therefore reimplemented the ‘‘equal?’’ method to test for equality of the underlying pointers instead.
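The effect of this reimplementation can be sketched in pure Ruby (ours, simplified; FakeCPtr is a hypothetical stand-in for DLXCPtr): two distinct Ruby wrappers around the same C address are reported as the same object.

```ruby
# Identity judged by the wrapped address, not by Ruby object identity.
class FakeCPtr                       # hypothetical stand-in for DLXCPtr
  attr_reader :addr
  def initialize(addr)
    @addr = addr                     # the wrapped C memory address
  end

  # Pointer equality: same underlying C object, regardless of wrapper.
  def equal?(other)
    other.is_a?(FakeCPtr) && other.addr == addr
  end
end

a = FakeCPtr.new(0xdeadbeef)
b = FakeCPtr.new(0xdeadbeef)         # a second wrapper, same C object
```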

6.4.5 Structure Types

To form new types of more complex data structures, C has a structure type (for an illustration, please see fig. 6.1). A structure type is a compound∗ type, similar to the types known as records in other programming languages [44]. Its elements are named, and usually called fields or members. Each field of a structure must have a type. All types that have been defined up to the point where the structure is defined can be used as the type of a field. A structure in C is typically used to encapsulate related data objects.

14. A feature that unfortunately also complicates memory management a bit.

∗ Com·pound noun: a whole formed by a union of two or more elements or parts adj.: consisting of two or more substances or ingredients or elements or parts source: Merriam Webster’s Online Dictionary

Ruby Equivalent

In Ruby, as in most other object oriented programming languages, a class is used to encapsulate both datastructure attributes and behaviour. Behaviour is implemented by means of methods, which are simply an equivalent of functions in C, except that they are always defined as part of a class or a module.

An attribute (called an instance variable) in Ruby can have various access modifiers. Such a modifier can be used to restrict access to the attribute to certain types of objects. From a usage perspective, a publicly accessible instance variable in Ruby is exactly the same as an ordinary method; to Ruby, or to any Ruby programmer, there is no distinction between the two.

Structures in DLX

To increase the level of seamlessness, in DLX, for every structure type that is defined, a similar class is defined in Ruby. Furthermore, to prevent accidental specification errors, the class is created as a subclass of a dynamically constructed super class. This class is created by the DLX struct command. The dynamic super class (a ‘‘DLXStruct’’) cannot exist by itself, enforcing a named subclass to be created.

The input to the struct command is deliberately made remarkably similar to the original C specifications for reasons set out at the beginning of this chapter. An example of a structure definition can be seen in figure 6.1.
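The ‘‘dynamically constructed super class’’ pattern can be sketched in plain Ruby as follows. Here make_struct is a hypothetical stand-in for the DLX struct command (the real command does considerably more, such as registering the type in the type database):

```ruby
# make_struct returns a fresh anonymous class; the user must subclass it
# to give it a name, mirroring: class Simple < struct "Simple", [...]
def make_struct(c_name, fields)
  Class.new do
    # Store the C type name and the field specification on the new class.
    define_singleton_method(:c_type_name) { c_name }
    define_singleton_method(:fields)      { fields }
  end
end

# The anonymous super class cannot usefully exist by itself;
# a named subclass is created.
class Simple < make_struct("struct Simple", ["int", :value])
end
```

Because the super class is created anonymously, it has no name of its own; only the named subclass does.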

DLX Structures and Seamlessness

The chosen specification syntax increases seamlessness in various ways:

• First of all, it bridges the (datastructure) type namespace in C to Ruby, making the type names available on both sides of the fence.

• Secondly, as we have seen above, classes are the natural language constructs in Ruby to encapsulate datastructure attributes and behaviour; from them, objects are instantiated. Objects are the building blocks of any object oriented programming language, and so end users of Ruby/DLX (i.e. programmers) will instantly be able to use them.

• The classes can be extended with pure Ruby attributes and behaviour (methods). In other words, they behave just like any ordinary Ruby class. To end users there is no difference.

• Furthermore, a class offers at least the same level of encapsulation as a C struct.

Object Instantiation and Initialization

In DLX, a C structure object is internally always handled as a pointer to that structure object. From a usage perspective, the process is similar to the wrapping of generic pointers from the previous section.

There are a few differences, however. These differences are mainly in the way:

1. DLX structure objects are handled internally;

2. existing C objects are wrapped;

3. new objects are created; and

4. member fields are accessed (since a generic pointer has no member fields).

To an end user of DLX there are three important methods available for descendants of the DLXStruct class:

DLXStruct.wrap(<ptr>) - This method is used to wrap an existing C pointer15 inside a newly created Ruby object of the corresponding type. This is somewhat similar to DLXCPtr.wrap from the previous section, with the difference that an explicit subclass is meant here. For example, Simple.wrap(<ptr>) (see figure 6.1) would create a new Simple object that wraps the existing C object.

DLXStruct.size - This method gives the size that an object of the corresponding structure type would occupy in memory. For example, Simple.size (see figure 6.1) would give the size of a C struct Simple at run-time, similar to C’s sizeof operator, which works at compile-time.

DLXStruct.new() - This method creates a new object in C using C’s memory allocation facility (malloc). The size of the allocated memory is determined using the size method (see above). The resulting allocated memory is then wrapped inside an object of the corresponding class. For example, Simple.new() (see figure 6.1) would create a new Simple object that wraps the newly allocated memory space. The optional parameter list is passed on to the constructor’s invocation (initialize).

Attributes and Member Fields

For each field of a structure, a method with a corresponding name is dynamically created. The input is, of course, taken from the DLX specification that is part of the input to the struct command. For each of the above get methods, a corresponding set method is created, whose name is the name of the field method postfixed with an ‘‘=’’.

The returned type of the accessed field is used to automatically wrap the value inside an object of the corresponding Ruby class.
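The dynamic creation of such get and set methods can be sketched with Ruby’s define_method. The class and its hash-based storage below are illustrative only; DLX actually reads from and writes to the underlying C memory:

```ruby
# Illustrative sketch of generating get/set methods per field at run-time.
class FieldHost
  def initialize
    @storage = {}   # stands in for the raw C memory of the struct
  end

  def self.declare_field(name)
    define_method(name) { @storage[name] }                # the get method
    define_method("#{name}=") { |v| @storage[name] = v }  # the set method: name + "="
  end

  # In DLX the field list would come from the struct specification.
  declare_field :x
  declare_field :y
end

p = FieldHost.new
p.x = 42
```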

15. The C pointer may be specified by a numeric value or a DLXCPtr.

While, in general, member field access of a C structure is implemented by means of the above get and set methods, there are a few exceptions. In order to increase seamlessness, we have added special default behaviour for some member field accesses to make the common case easy. Member field types that are subject to such altered behaviour are:

• Pointers-to-pointers

• Nested structures

• Embedded strings

• Embedded callback functions

We shall discuss the first three in the following subsections; the fourth, embedded callback functions, is discussed as part of the section on callback functions (see section 6.4.10).

Pointers-to-pointers

When DLX encounters a pointer-to-pointer, it assumes that an array is referenced. The returned value is thus an array of one pointer level higher. This means that a Simple** is returned as an array with elements of type Simple*, and a Simple*** is returned as an array with elements of type Simple** (which is an array of arrays of elements of type Simple*), and so on. Naturally, the fact that the array contains pointers to Simple structures is still shielded from the Ruby programmer.

Of course, the default behaviour, which works for most cases, may be wrong from time to time. In such a case, an explicit type cast is unavoidable (see section 6.4.12).

Nested structures

In C, a structure can be nested inside another structure, which, in turn can be nested inside yet another structure, and so on.

Nested structures are fully supported in DLX. With nested structures, however, the limitations of what can and what cannot be made seamless come into view. Some things we can hide from the user by applying a clever automatic translation (based on the information taken from the DLX specification). Other things, unfortunately, become very dangerous from an end user perspective.

To explain what we can make seamless, consider the C structure in figure 6.8:

    struct Point { ... };

    struct Triangle
    {
        struct Point p1;
        struct Point p2;
        struct Point p3;
    };

Figure 6.8: Nested structures in C.

In C it is common for such a structure type to be typically handled as a pointer. When the structure is used, and the fields of the nested Point structures are accessed, this results in a rather awkward mix of both . and -> operators:

    Triangle* t = newTriangle();
    t->p1.x = 0; t->p1.y = 1;

In Ruby, there is no difference between the operators . and ->. In fact, Ruby does not even have the -> operator, only the . operator. For the sake of seamlessness, DLX users always use the . operator; DLX internally selects the right operation automatically, depending on the current context. So despite the fact that p1 is a nested structure, Ruby programmers would write:

    t = Triangle.new
    t.p1.x = 0; t.p1.y = 1

Unfortunately, sometimes hiding information can become dangerous. Because of seamlessness, we have made DLX such, that all member field accesses appear the same. This means that there is no difference between accessing a pointer to a structure, or accessing a nested structure. But this can be a dangerous thing, for instance: What is the correct behaviour for assigning new values to a nested structure?

The current behaviour of DLX does, under the hood, what is typically done in C: a memory copy of the new structure on top of the nested structure. This, of course, results in duplication of information. In C, this is explicit, so if something strange happens, the programmer can be blamed. In Ruby, however, this may lead to strange behaviour. Because, all of a sudden, two objects exist, and a mutation of an attribute of one object is currently not reflected in all objects that have become cloned copies of the original object.

We think that this is an example of where the boundaries of seamlessness come into view. One possible solution may be to keep track of all cloned objects and keep them synchronized under the hood. However, we are unsure if, and if so, how this may impact the effectiveness of DLX. For instance, when it needs to interface to a shared library that doesn’t expect such automatic behaviour –or shall we say: automagic?

Embedded strings

What we discussed about assignment of nested structures can, of course, also be said about assignment of strings that are embedded in a structure (see figure 6.9). The current implementation is also similar to that of assigning nested structures. Again, memory is copied, except that this time the actual copy is performed by C’s string processing utilities (i.e. strcpy).

    struct NameRecord
    {
        char given_name[25];
        char surname[25];
    };

Figure 6.9: Embedded strings inside a structure.

6.4.6 Union Types

A Union Type in C provides a means of combining several different types into a single common type. This new type can be made to hold values of any of the types that were used as part of the definition of the union type. For instance, the union type definition in figure 6.10, can hold either an int, a struct MouseEvent*, or a struct ButtonEvent*.

C:

    union SDL_Event
    {
        int type;
        struct MouseEvent* mouse;
        struct ButtonEvent* button;
    };
    typedef union SDL_Event
        SDL_Event;

DLX:

    class "SDL_Event" < union "SDL_Event"
    [
        "int"                , :type  ,
        "struct MouseEvent*" , :mouse ,
        "struct ButtonEvent*", :button,
    ]
    end
    typealias( "SDL_Event",
               "union SDL_Event" );

Figure 6.10: An example of a union definition in C and DLX.

The Union Type is in many ways very similar to the Structure Type from the previous section. However, different from the Structure Type, a Union Type can only contain a single value at a time. A value of a union is accessed as a particular type by referencing its respective member field.

Defining union types in C is almost identical to defining structure types, except that the struct keyword is replaced by the union keyword.

Ruby Equivalent

Ruby does not have an equivalent to C’s union type. In Ruby, polymorphism is ordinarily achieved by inheritance, and through mixins16.

In Ruby’s form of polymorphism, however, the objects automatically take the right ‘‘shape’’. Let us explain this.

Consider that there are two subclasses, Square and Circle, of a class Shape. Shape prescribes the draw() method, which is implemented by Square (which draws a square) and Circle (which draws a circle). Now suppose a method makeRandomShape() returns a random Shape object (which is either a Square or a Circle); then, whenever a Shape is returned, it automatically assumes the right shape. By this we mean that, if the draw function is invoked on a Shape that is, in fact, a Square, then a square is drawn, whereas if the Shape had, in fact, been a Circle, then a circle would have been drawn.
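Spelled out in plain Ruby, the example above looks like this (make_random_shape stands in for makeRandomShape(); the draw methods return strings rather than actually drawing, to keep the sketch self-contained):

```ruby
class Shape
  def draw
    raise NotImplementedError   # Shape only prescribes the method
  end
end

class Square < Shape
  def draw; "square"; end       # "draws" a square
end

class Circle < Shape
  def draw; "circle"; end       # "draws" a circle
end

# Whichever subclass comes back, invoking draw automatically
# does the right thing -- no manual type distinction is needed.
def make_random_shape
  [Square, Circle].sample.new
end

shape = make_random_shape
```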

16. The ability to extend a class’s functionality by including modules with cross-class, reusable functionality.

In C, with unions, this is not so. While the returned union object would be able to take the shape and form of any of the declared member field types17, the distinction has to be made manually.

For this, usually a trick is used in C: By giving all of the union’s field types a common first field, say, a type identification field –such as an enumerated integer for instance–, this type identification field can then be guaranteed to exist, regardless of the actual type of the union. This allows for a distinction to be made regarding the actual type of the union, so that it can assume the correct shape by explicitly accessing the appropriate member field of the union.

Unions in DLX

In DLX, unions are currently mapped in a way that is similar to structures. This means that the above C-like behaviour is still explicit in DLX. We therefore also follow the C specification syntax: To declare a union in DLX, simply replace the struct keyword with the union keyword, the rest is identical.

Because the ‘‘common first field’’ trick appears very often in C when unions are used (but not always18), we may be able to introduce extra type specification syntax to automate this in future versions of DLX.
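The ‘‘common first field’’ trick described above can be sketched in plain Ruby; the constants and the hash-based event below are illustrative stand-ins, not an actual SDL binding:

```ruby
# Every variant of the union starts with the same type-identification
# field, so it can always be read safely before the proper member is chosen.
MOUSE_EVENT  = 0
BUTTON_EVENT = 1

def handle_event(event)
  case event[:type]            # the guaranteed common first field
  when MOUSE_EVENT  then "mouse at #{event[:x]},#{event[:y]}"
  when BUTTON_EVENT then "button #{event[:button]}"
  else "unknown"
  end
end

msg = handle_event({ type: BUTTON_EVENT, button: 3 })
```

This is exactly the distinction that currently remains explicit in DLX: the programmer inspects the type field and then accesses the appropriate member.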

6.4.7 Array Types

According to the reference manual, if T is any C type except void, or ‘‘function returning...’’, the type ‘‘array of T’’ may be declared. Values of this type are sequences of elements of type T.

In Ruby, an array is an object that can hold a number of arbitrary objects stored as a sequential list.

Arrays in DLX

Array support is handled in DLX by the DLXArray class. In the current stable release of DLX a specific subclass of the DLXArray is created for any of the following types:

1. Any built-in Integer Type (e.g. CharArray, ShortArray, IntArray, etc.)

2. Any built-in Floating-point Type (e.g. FloatArray and DoubleArray, etc.)

3. Any struct or union class that is specified (e.g. PointArray, SimpleArray, etc.)

Like the Structure Type described previously, the array classes can be used to either wrap an existing C object as a DLX array, or they can be used to create a new array of the corresponding type with the specified size.

17. It can, of course, only do so one at a time.
18. Sometimes the distinction is made based on a separate, non-encapsulated object.

Important to note is that, currently, arrays that are automatically wrapped (e.g. from a function call’s return value or a structure’s member field access) do not have their size encapsulated as part of the DLX array object. Their size must be derived separately. (How this is done depends on the C API used.)

DLX Arrays and Seamlessness

Coming up with a DLX Array implementation and mapping has proven to be quite a challenge19. This is because there are quite a few difficulties to overcome. We shall highlight these as briefly as possible in the next couple of subsections.

Differences

One reason why it has proven to be quite a challenge to map C array types to Ruby arrays is because there are so many differences between the two.

Differences in Content-type restrictions - Ruby arrays can hold any mix of objects of different types. In C, an array is always declared as ‘‘an array of type T’’, where T can be any type barring the exceptions set out at the beginning of this section. This implies that the array can only hold objects of that particular type.

Differences in Encapsulation - In Ruby, the size of an array is encapsulated inside the array object; a C array does not encapsulate its size. It is just a portion of allocated memory where a sequence of elements of a certain type T can be stored. How many elements can be stored at that particular memory location is only known implicitly, or explicitly by storing this size information in a separate variable. Consequently, in Ruby, you cannot make a mistake by referencing elements beyond the array’s boundary, since this is checked by the interpreter. In C, references outside an array’s boundary are not prohibited, and this may lead to undefined behaviour.

Differences in Size boundaries - In C, the size of an array, that is, the number of elements that can be stored in the array object is fixed. If more elements need to be stored, then extra memory must be explicitly reserved before such elements can be stored. In Ruby, the size of an array is dynamic, with only a small but configurable starting size. If more elements are stored in the array, it is automatically resized to hold a larger maximum number of elements.
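The dynamic resizing of Ruby arrays can be demonstrated in a few lines:

```ruby
# Storing past the current end of a Ruby array simply resizes it,
# padding the gap with nil -- the equivalent C code would reference
# memory beyond the array's boundary.
ary = [1, 2, 3]
ary[9] = 10        # well beyond the current boundary: no error in Ruby
```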

The char*: Byte Array or String?

Another challenging mapping is that of a very special type of array: the array of char, also denoted in pointer notation as char*. Very often, this type of array denotes a sequence of characters, or a string of text. Because arrays do not encapsulate their size, this poses a problem for handling strings in C: how many characters should be processed in this string? As a workaround to this limitation, strings in C embed their size by means of a string termination character (’\0’).

19. The current development release of DLX is now on its third reimplementation.

Since Ruby is implemented in C, it is not surprising that ordinary strings in Ruby are also implemented under-the-hood as such a sequence of characters. This makes it more straightforward to establish the mapping... but only once it is known that it is a string of text that is stored inside a certain array object.

Sometimes, however, such an array of char(acters) means little more than just that: a sequence of bytes. In such a situation, there is no string termination character.

So the difficulty for DLX, here, is not: ‘‘How to map a char array to either a byte array or a string?’’ Rather, because of the ambiguity in the type specification, it is: ‘‘How to determine whether the considered object is either one or the other?’’

In the future, it may be a good idea to introduce a different specification to distinguish between real strings and sequences of bytes (e.g. char* versus byte*). (This would be a trade-off between the maintainability and the usability of a multi-language interface established with DLX.)

Currently, by default, char arrays are mapped to byte arrays. However, to keep our interface fully effective, we have added a method, to_string, to the DLXCPtr class that allows any DLXCPtr to be interpreted –type cast, if you will– as an ordinary string.
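The ambiguity can be illustrated in plain Ruby. The to_string helper below is our own stand-in, emulating what a cast from byte array to string must do (stop at the ’\0’ terminator); it is not the DLX implementation:

```ruby
# The same bytes can be read as a raw byte sequence, or -- up to the
# '\0' terminator -- as a C string.
bytes = [72, 105, 0, 255, 255]   # "Hi\0" followed by junk bytes

def to_string(byte_ary)
  terminator = byte_ary.index(0) || byte_ary.size
  byte_ary[0...terminator].pack("C*")   # pack the leading bytes into a String
end

s = to_string(bytes)
```

Read as a byte array, all five elements are meaningful; read as a string, only the part before the terminator is.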

Iterators

One big difference between Ruby arrays and C arrays is the way one typically iterates through their elements. The best way to see this, is by looking at a side by side comparison (see figure 6.11).

C:

    char* ary[] = {"This", "is",
                   "an", "array"};
    int size = 4;
    int i = 0;

    for( i = 0; i < size; i++ ) {
        printf( "%s\n", ary[i] );
    }

Ruby:

    ary = ["This", "is",
           "an", "array"]

    ary.each { |elt|
        puts( elt )
    }

Figure 6.11: A side-by-side comparison of array iteration in C versus Ruby.

The main difference that can be observed from the source code fragments above is that iteration in C is done by using a for-loop, with an explicit reference to, and subsequent test against, the size of the array. In Ruby, on the other hand, iteration is done by a special iterator function, that receives a Proc20. Then, for each step in the iteration, the specified Proc is invoked with the next element of the array as its argument. The actual size of the array is never explicitly mentioned or considered.

Currently, DLX array iteration cannot be performed in a way that is natural to Ruby, because the size of an array is not always encapsulated in a DLX array (it is only so when an array is created explicitly). Therefore it is necessary to fall back on the method depicted in figure 6.12:

    # ary is an instance of a
    # DLXArray (e.g. StringArray)
    size.times{ |i|
        puts( ary[i] );
    }

Figure 6.12: Array iteration over DLX Arrays.

20. A Proc in Ruby is a closure (a type of anonymous function).

On the other hand, it is possible to store wrapped C objects inside an ordinary Ruby array, in which case of course, Ruby iteration is simply possible.

Contiguous versus Non-Contiguous

One important aspect of arrays that cannot be kept hidden from the users for the sake of seamlessness is the difference in the way an array can be laid out in memory.

For instance, in C, there is a difference between:

• struct Triangle ary[10]; and

• struct Triangle* ary[10].

To illustrate the differences, a schematic representation of the memory layout is given in figure 6.13. Clearly visible is that while one of the arrays is laid out as one large contiguous block of memory, the other array is much smaller, containing pointers to separate objects that may be scattered across a program’s memory.

In general, this sort of difference cannot automatically be resolved by DLX. Therefore, it cannot be kept hidden from the user.

Because in DLX there is no notion of pointers, it becomes hard to explain this difference to the user. However, since we see no other parts of our type mapping having to deal with a similar problem, we have decided to compromise seamlessness of our solution in favor of its effectiveness (which is obviously most important to us).

The compromise consists of the introduction of extra terminology, new to Ruby users, but without explicitly mentioning the notion of pointers (which are still kept hidden and are still automatically dereferenced when needed). With the new terminology, from now on, there are simply two different types of arrays. The semantics of a function call make clear which type of array is required, and they are supplied as-is.

Figure 6.13: Contiguous versus non-contiguous memory layout of arrays

We call the struct Triangle ary[10] a contiguous array and the struct Triangle* ary[10] a non-contiguous array. We think that this terminology concisely depicts the differences, without having the notion of pointers becoming too predominant.

With the existence of, now, two separate subtypes of arrays, also come the added functions that are needed to make this distinction when arrays are created. So we add two functions that are only defined for arrays to the already known and common functions new and wrap. In the end, there are four functions to create and wrap C array objects as DLX arrays:

.new - This function is used to create an ordinary non-contiguous array.

.wrap - This function is used to wrap an ordinary non-contiguous array that already exists in C.

.new! - This function21 is used to create a contiguous array.

.wrap! - This function22 is used to wrap a contiguous array that already exists in C.
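The difference between the two array layouts can be emulated in plain Ruby, with Strings standing in for malloc’ed memory blocks (the element size is illustrative, and this is not how DLX allocates memory):

```ruby
# A contiguous array packs all elements into one memory block
# (struct Triangle ary[10]); a non-contiguous array holds pointers
# to ten separate blocks (struct Triangle* ary[10]).
ELEM_SIZE = 8   # hypothetical size of one element, in bytes

contiguous     = "\0" * (ELEM_SIZE * 10)             # one large block
non_contiguous = Array.new(10) { "\0" * ELEM_SIZE }  # ten scattered blocks
```

In DLX terms, new!/wrap! would deal with the first kind of object, new/wrap with the second.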

Discussion

Array support in DLX is still a moving target. The development version of DLX is now on its third implementation for arrays.

It is important to make a distinction between things that simply cannot be made absolutely seamless, and things that may eventually become more or less seamless.

21. Note the exclamation mark.
22. Idem.

For example, when interfacing to an existing C API that expects arrays to hold only a specific type of objects, such an array cannot ever be used to hold just about any Ruby object. This is a hard constraint that cannot be worked around.

On the other hand, it may be possible to improve encapsulation of data and size in DLX array objects.

Sometimes it is possible to derive an array’s size directly from the API (either automatically or using extra type specification syntax). Consider for example the type definition in figure 6.14. Here, the size of the Point* array is given in the member field ‘‘count’’. If it were possible to relate these two fields in the DLX specification, then we would be able to encapsulate both size and data in DLX arrays automatically. There are also other examples of automatically encapsulating size and data in DLX arrays, but, due to an ever growing thesis size, we cannot elaborate on them.

    typedef struct
    {
        int count;
        Point** points;
    } Polygon;

Figure 6.14: Sometimes the size of an array is specified in a separate member field.

If encapsulation of data and size in DLX arrays is achieved, it opens possibilities for further increasing of seamlessness:

• automatic boundary checking

• automatic resizing

• iterator support

6.4.8 Typedef Names

As we saw at the beginning of this section, special names can be given to specific instances of (combinations of) most of the built-in C types. This is done using the typedef command.

DLX has support for something similar, which is called the type alias. It is invoked using the typealias command. Once a type alias is input into the type database, it can be used anywhere within the embedded C syntax as a stand-in for the original, usually more elaborate, type name. The similarities between the syntax of C’s typedef command and DLX’s typealias command are obvious, and the translation between the two is self-explanatory. An example of type aliasing, and its similarities to C’s typedef, can be seen in figure 6.1.

When type aliases are used in this way, their purpose is mainly to allow the C API embedded in the DLX specification to be as similar as possible to the original C API specification.

Another use for type aliases is to specify special instances of arrays. For instance, the following source code fragment creates a new class called Quaternion which, when instantiated, will be implemented by a FloatArray of size 4:

    typealias( "Quaternion", "float[4]" )
    Quaternion.new( 1.2, 1.3, 1.4, 1.5 )

There are also other uses for type aliases, they shall be introduced later in this chapter (see section 6.6.3).

6.4.9 Enumerated Types

An enumerated type in C is an Integer Type that is given a special name and meaning, and a list of possible values (which are not enforced).

Ruby does not have an equivalent of such an enumerated type.

Enumerated types in DLX are therefore emulated. This is done by creating a type alias of the enumerated C type to int. This ensures that the DLX Type Database will recognise the enumerated type name in subsequent DLX specifications. Then a Ruby module is created with the same name as the type alias. The possible values of the enumerated C type are then specified as Ruby constants inside the created module. For example:

    typealias( "SDL_eventaction", "int" );
    module SDL_eventaction
        SDL_ADDEVENT  = 0
        SDL_PEEKEVENT = 1
        SDL_GETEVENT  = 2
    end

6.4.10 Function Types

Function types are handled specially in DLX (see section 6.3) and, as such, they are outside the scope of either the DLX Ruby-to-C Mapper or the DLX C-to-Ruby Mapper.

However, there is an exception: Callbacks are handled by the DLX Type Mappers and we shall discuss this mapping next.

Callbacks

We have already seen that pointers to functions, also commonly referred to as callbacks, are handled in DLX by the Callback Caller. We have also seen which different forms of multi-language communication via callbacks are supported by DLX.

The fact that one can take the pointer to a function turns such functions into objects that can be passed around (in contrast to a mere function in C, which is static, and can only be bound to a Ruby namespace module using the Symbol Binder). Because a callback can be passed around as an object, it must therefore be sensibly mapped to Ruby.

DLX distinguishes two ways for a callback to be represented in Ruby:

1. A standalone callback object that is not part of any other object.

2. A callback that is part of a structure or a union (i.e. in the form of a member field).

To pass a callback function to a callback calling function explicitly, or to receive a pointer to a function as the return value of another function, a standalone callback object must be created in DLX. To be able to create such a standalone callback object it must be specified as part of the DLX type specification (for example, see line 3 in figure 6.15). This implicitly creates a new class with the callback’s name, whose objects, when instantiated, can be used to wrap such callbacks. An example of wrapping a Ruby callback is given in line 5 of figure 6.15.

When a callback is part of a structure or a union23, it is handled differently from ordinary fields. Recall that, to a Ruby programmer, public attributes in Ruby are indistinguishable from methods that implement operations or other behaviour.

To increase seamlessness, when a structure’s field is accessed that references a callback, this does not retrieve the callback object, but invokes the callback function itself instead (line 11 in figure 6.15). We feel that this behaviour is the most likely behaviour that Ruby programmers would expect. If, on the other hand, such a field must be supplied to a C function that expects a standalone object, then it must be explicitly accessed. This is denoted by a trailing exclamation mark (see line 13 in figure 6.15).

It is also fairly simple to change the callback member field to point to a new callback by either supplying a standalone callback object explicitly, or by supplying a receiver/method-name pair, which will create a callback object implicitly (lines 8 and 9 in figure 6.15).

    1  def ruby_callback_method( text ); ... ; end
    2
    3  callback( "char* (*SimpleCallback)( char* )" );
    4
    5  standalone_callback = SimpleCallback.new( self, :ruby_callback_method );
    6
    7  obj = NotSoSimple.new;
    8  obj.simple_callback = standalone_callback;
    9  obj.simple_callback = self, :ruby_callback_method;
    10
    11 obj.simple_callback( "Hello World!" );
    12 ExampleLib.callback_calling_function( standalone_callback, "Hello World!" );
    13 ExampleLib.callback_calling_function( obj.simple_callback!, "Hello World!" );

Figure 6.15: In DLX, callbacks are mapped to Ruby in several different ways.

23. For an example, please see lines 36-38 in figure 6.1.

6.4.11 The Void Type

The void type in C has two distinct purposes:

1. To denote the absence of any particular type, where it mainly is used as the return type of a function that does not return a value.

2. As a pointer (void*), to denote the generic pointer type.

Ruby Equivalents

In Ruby, every method must return a value, even if the value is nil (Ruby’s null object). Pure Ruby methods are not required to use a return statement; if no such statement is given, then the value of the last evaluated expression is returned automatically, defaulting to nil.
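For example:

```ruby
# With no explicit return statement, the value of the last evaluated
# expression is returned.
def last_expression
  1 + 1          # implicitly returned
end

# A method whose body evaluates nothing returns nil.
def empty_body
end

a = last_expression
b = empty_body
```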

The Void Type in DLX

In DLX, the Ruby-to-C and C-to-Ruby Mappers automatically convert between void and nil when appropriate.

Use of void as a generic pointer (void*) was already discussed in section 6.4.4.

6.4.12 Type Casting

Normally in Ruby, type casting is neither possible nor required. Because of its duck typing, the type of a Ruby object is determined by the methods it supports. Inheritance or mix-in support takes care of the rest.

In C, however, an explicit type cast is sometimes required to interpret an object as a different (but compatible) type. In DLX, such type casting is supported by means of the ‘‘wrap’’ method that is defined for all of the pointer types. By rewrapping objects inside new Ruby objects of a different type, a C object will be interpreted differently.

Consider, for instance, the following example:

    class Square < struct "Square",
    [
        "Point*"    , :offset,
        "Dimension*", :dimension
    ]
    end

    class ColoredSquare < struct "Square",
    [
        "Point*"    , :offset,
        "Dimension*", :dimension,
        "Color*"    , :color
    ]
    end

Suppose an object my_square is wrapped as a ‘‘Square’’, but it is in fact more than just a ‘‘Square’’, such as a ‘‘ColoredSquare’’. Then, as long as my_square remains wrapped as a square, it is not possible to access the ‘‘color’’ member field. For this, a cast is required that rewraps the object my_square as a ‘‘ColoredSquare’’:

    my_colored_square = ColoredSquare.wrap( my_square )

Explicit type casting for simple types such as Integer Types or Floating-point Types is neither possible nor necessary: Type casts for such types are performed automatically by DLX in the background.

Discussion

The explicit type casting that is described here, while effective, has a negative impact on seamlessness, since in Ruby the same effects are achieved by inheritance and mix-in support.

As a future improvement, it may be a good idea to investigate whether we can avoid explicit type casts altogether (without affecting the effectiveness of DLX) by better integrating the C types with Ruby’s inheritance or mix-in support.

6.5 Memory layout reconstruction

In the previous section, we have seen all the types that are specified for the C programming language. We have seen how C objects were mapped to Ruby objects. Some of the (more simple) C values were directly converted into Ruby immediate values. Other (more complex) C objects were wrapped inside Ruby objects, while the actual C objects remained as allocated in and by C.

These more complex C objects, most notably, the structure types –but some of this also applies to union types and array types– represent a collection of values at certain memory locations.

Because of our decision to establish an interface with a shared C library at run-time, in order to access any of the subobjects that may be stored at such a location, DLX must also be able to reconstruct the memory layout of these locations at run-time.

To correctly do this, we must be able to answer the following questions correctly for an object of any type:

1. What is the size of an object of this type?

2. What are the alignment characteristics of an object of this type?

They are important questions, because if we get the answers wrong, chances are that not even the simplest interface can be established.

This section is about answering these two questions. The algorithms that we present here are improved (and corrected) versions of the original algorithms as we derived them from the Ruby/DL source code (see section 5.5.3).

6.5.1 Object size and alignment

To understand how the memory layout reconstruction works, it is best to separate the simple types from the non-simple types.

In this section, a simple type is any type of which we can statically determine its size and alignment characteristics at compile time. Simple types are thus:

1. Integer Types

2. Floating-point Types

3. Pointer Types

Subsequently, a non-simple type is any type that is not one of the above types.

To be able to answer the two questions ‘‘What is the size of an object of this type?’’ and ‘‘What are the alignment characteristics of an object of this type?’’ at run-time, the algorithm that we use consists of a compile-time part and a run-time part.

The compile-time part of the algorithm is used to determine platform-dependent size and alignment characteristics24 of the simple types. The run-time part of the algorithm depends on this to correctly determine the size and alignment characteristics of the non-simple types at run-time.

The algorithm, part I: Compile-time part

1. First, all supported types, simple and non-simple, are given a unique numeric identifier.

2. Then, at compile time, a size map is created. This size map is used to map the numeric type identifiers to the actual byte sizes of allocated objects of these types on the target platform. As was explained previously in section 6.4.3, the C language specification does not dictate actual byte sizes for any of the integer or floating-point types. The actual allocated size of an object of such a simple type depends on the selected target platform. Therefore it is determined using the compile-time sizeof operator. The size map is stored as part of the dlx.so library for later reference at run-time.

24These depend on the target platform, i.e. the compiler, operating system, and architecture used.

3. Also created at compile-time is the alignment map. This is a map, like the size map, that is used to store identified alignment characteristics of the simple types, so that they can be looked up at run-time for later reference. To identify these alignment characteristics, a series of introspective structures is created to expose these characteristics (see figure 6.16). By doing this, there is a risk of non-portability, but our algorithm relies on predictability of the alignment characteristics, not on the outcome of a particular alignment characteristic (i.e. the alignment characteristics must not be random, but we do not depend on hard-coded characteristics for just a set of compilers).

typedef struct { char c; void*  x; } s_voidp;
typedef struct { char c; short  x; } s_short;
typedef struct { char c; float  x; } s_float;
typedef struct { char c; double x; } s_double;

static int ALIGN_MAP[] =
{
    (sizeof(s_voidp)  - sizeof(void *)),
    (sizeof(s_short)  - sizeof(short)),
    (sizeof(s_float)  - sizeof(float)),
    (sizeof(s_double) - sizeof(double)),
};

Figure 6.16: Excerpt of the alignment map.

The algorithm, part II: Run-time part

Once the alignment characteristics of the simple types have been recorded, the run-time part of the memory layout reconstruction algorithm can be used. The run-time part is used to calculate the memory layout of non-trivial objects such as Structure Types. A compound type such as these can have a very different physical memory layout than can be expected from the type’s specification. For instance, it is not uncommon for the type specification in figure 6.17 (top) to be laid out internally like in figure 6.17 (bottom).

The gaps are caused by the alignment characteristics of the used compiler (which in turn depends on things such as target operating system and architecture).

While it may seem arbitrary, or even infeasible, to calculate such memory alignments at run-time, in reality the characteristics are very predictable for most compilers and architectures. There are only a few things that need to be taken into account:

• First of all, very few architectures pack types inside a structure without gaps (and if one does, the memory alignment characteristics of the built-in types will expose this, and alignment will take place accordingly). For technical reasons, architectures commonly align objects of certain types to four-byte (32-bit) or eight-byte (64-bit) boundaries.

• For most compilers/architectures, the final alignment, c.q. size, of a structure depends on the alignment characteristics of the largest type it contains.

struct Foo
{
    char  c;
    short s;
    int   i;
    char  c2;
    float f;
    short s2;
};

Figure 6.17: Example of alignment characteristics. Top: Example structure in C. Bottom: Physical layout of the structure.

The memory layout reconstruction algorithm that is used by DLX depends on the predictability of these characteristics and on the alignment characteristics of the simple types that were exposed as part of the compile-time part of the algorithm.

To calculate the actual memory layout of arbitrary compound types, the run-time algorithm in figure 6.18 is used.

Basically what is done is:

1. For each member field’s type, calculate its size and alignment characteristics.

2. If necessary: realign.

3. Update the offset value that acts as the start parameter for the next field.

4. Update the structure’s own alignment type.

5. Once all field layout characteristics have been calculated: finish up by calculating the alignment, c.q. size, characteristics of the structure itself using the current structure alignment type (which we discussed earlier).

To improve performance, the calculated memory layout is cached, so that accessing member fields can be done very quickly.

 1 def align( address, align )
 2   res = address
 3
 4   if( align != 0 )
 5     d = address % align
 6     if( d == 0 ) # do not align
 7       res = address
 8     else # crosses alignment boundary, realign
 9       res = address + (align - d)
10     end
11   end
12
13   return( res )
14 end
15
16 def calculate_field_layout( field_type, field_offset, struct_size )
17   orig_offset = field_offset
18   field_size = 0
19
20   field_offset = align( orig_offset, ALIGN_MAP[field_type] )
21   struct_size = field_offset - orig_offset
22   field_size = typeinfo_sizeof( field_type )
23   field_offset += field_size
24
25   return [ field_offset, struct_size ]
26 end
27
28 def calculate_struct_layout( fields )
29   offset = 0
30   size = 0
31   alignment_type = TYPE_CHAR
32   foreach field in fields
33     field_type = type(field)
34
35     offset, size = calculate_field_layout( field_type, offset, size )
36
37     # update the alignment type for the whole structure
38     if( alignment_type != TYPE_VOIDP )
39       if( SIZE_MAP[field_type] > SIZE_MAP[alignment_type] )
40         alignment_type = field_type
41       elsif( SIZE_MAP[field_type] == UNKNOWN )
42         alignment_type = TYPE_VOIDP
43       end
44     end
45
46     size = align( offset, ALIGN_MAP[alignment_type] )
47   end
48 end

Figure 6.18: Run-time memory layout algorithm in pseudo-Ruby

Note: The typeinfo_sizeof() function in line 22 is not defined here: It returns the size of an arbitrary type using either the SIZE_MAP or a previously calculated type size (for example a structure size).
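To make the algorithm concrete, the following sketch (plain Ruby, not DLX code; the sizes and alignments are assumptions for a typical 32-bit platform where char is 1 byte, short 2, int and float 4, each aligned to its own size) applies the align() step by hand to the struct Foo of figure 6.17:

```ruby
# Assumed per-type sizes and alignments (hypothetical 32-bit platform).
SIZEOF = { :char => 1, :short => 2, :int => 4, :float => 4 }
ALIGN  = { :char => 1, :short => 2, :int => 4, :float => 4 }

def align( address, alignment )
  return address if alignment.zero?
  d = address % alignment
  d.zero? ? address : address + (alignment - d)  # realign if needed
end

offset  = 0
offsets = {}
[ [:c,  :char], [:s, :short], [:i,  :int],
  [:c2, :char], [:f, :float], [:s2, :short] ].each do |name, type|
  offset = align( offset, ALIGN[type] )  # skip over any alignment gap
  offsets[name] = offset
  offset += SIZEOF[type]
end
# Final struct size is rounded up to the alignment of the largest member.
struct_size = align( offset, ALIGN[:int] )

offsets     # => { :c=>0, :s=>2, :i=>4, :c2=>8, :f=>12, :s2=>16 }
struct_size # => 20
```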

6.5.2 Bitfields

According to the reference manual[44], C allows the programmer to pack integer components into spaces smaller than the compiler would ordinarily allow (for example, a 3-bit wide integer field with eight possible values). Such integer components are called bitfields.

While not used often, some C APIs actually do use bitfields, and because of this, DLX supports bitfields too.

Do not bother to try to derive bitfield support from the run-time algorithm given in this section: the version given here is a simplified version of the one actually in use. The actual memory layout reconstruction algorithm is much more complex because of the bitfield support. We do not give a detailed explanation of the bitfield algorithm in this thesis.

The main reason for this is that, while the algorithm described in this section appears to work on several compiler/architecture combinations (and it is expected to work on even more combinations than we have been able to verify ourselves), bitfields are likely to cause portability issues.

The reference manual does state that, although bitfields are likely to cause portability issues, the packing of bitfields remains predictable. And we have found this to be true for at least two different compilers, TCC[22] and GCC[42], albeit each with its own characteristics. We had to derive these by hand25 (as opposed to the alignment algorithm given in this section, which does this automatically for structures that do not contain bitfields).

So, while DLX supports bitfields on some compiler/architecture combinations for the sake of effectiveness when interfacing to an API that uses them, it is best to avoid them.

6.5.3 Other uses of the algorithm

Parts of the memory layout reconstruction algorithm are also used by the Function Caller and Callback Caller to calculate the correct offsets when passing function type arguments. (This is done during push-down (argument) stack reconstruction.)

6.6 Error Handling

During software development many things can go wrong. There are many classifications of software errors, but this is not the place to go into the details of such classifications.

When discussing errors and error handling in relation to DLX we make two distinctions.

First of all, we think it is useful to make a distinction between two causes for errors, these are:

1. Errors caused by using the language interface itself.

25A task that takes about an hour to do.

2. Errors that occur during software development on either side of the interface.

Secondly, we think it is useful to make a distinction between trapped and untrapped execution errors (recall the definition we gave in section 2.4.1).

6.6.1 Trapped and Untrapped Errors in C and Ruby

First we will discuss errors that occur during software development on either side of the interface.

In C, some execution errors may be untrapped. Untrapped execution errors in C are often caused by mistakes made during (explicit) memory management; dangling pointers are one example that results from bad memory management. This is why execution errors in C often result in a system segmentation fault. Without the aid of debugging tools, a defensive coding style, and/or the use of assertions, it is often not clear at which point in the C source code the error manifested itself, let alone where the error actually originated.

In Ruby, an execution error is always trapped sooner or later. It is also very clear where an error occurs in Ruby, since Ruby always prints a stack trace in which the most recently called methods are printed along with their locations (file names and line numbers).

There are errors that are trapped both in C and Ruby at program start. These are typically simple errors, such as syntax errors. On the other hand, there are also errors that are trapped in C at compile-time (i.e. before program execution), which are only trapped at run-time (i.e. during program execution) by Ruby. An example can be found in calling a non-existent function or method.
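As a small illustration (plain Ruby): a call to a non-existent method passes any load-time checks and is only trapped when the call is actually executed.

```ruby
# Ruby performs no compile-time check on the method reference below;
# defining this method raises no error.
def risky
  Object.new.no_such_method
end

# Only actually calling it traps a NoMethodError, at run-time.
trapped = begin
  risky
  false
rescue NoMethodError
  true
end

trapped  # => true
```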

Consequences for Software Development with DLX

So what are the consequences of errors made in either Ruby or C?

In Ruby, when an error occurs, it remains trapped and handled as usual, so nothing much is different from developing software in Ruby standalone. For the untrapped errors in C, however, some interesting consequences follow from the use of DLX in multi-language software development.

Because we call C functions from Ruby, if an error (segmentation fault) manifests itself inside such a function, then the segmentation fault (which is actually the operating system trapping the error) is caught by Ruby. By catching the error, it gives DLX the opportunity to display a message in which it says: The error occurred in this C function. Unfortunately, one of two situations may still be the case:

1. An error that originates from one function that manifests itself in the same function.

2. An error that originates from one function that manifests itself in another function.

Still, we think it is interesting to see that, even without the aid of a specialised debugging tool or any additional programming tricks, we are still able to give some indication as to where an execution error in C occurred. In standalone C development (see above), this is not the case.

6.6.2 Trapped and Untrapped Errors in the DLX Language Interface

Let us now discuss errors caused by using the language interface itself.

DLX uses a specially prepared specification in combination with a smart memory layout reconstruction algorithm to reconstruct the memory layout of arbitrary C types, at run-time, and in a platform-independent manner26. We go to great lengths to ensure the correctness of this algorithm, but still many things can go wrong when using DLX; to err is human and, as with any development activity, errors are always waiting to be made.

We attempt to trap as many of these errors as possible. In order to do this in a structured manner, DLX distinguishes the following types of errors:

1. specification errors - For example, incomplete or inconsistent type specifications and illegal use of type aliases belong to this category of errors.

2. usage errors - For example, when an integer type is passed to a function parameter that expects a pointer type.

3. synchronisation errors - These are errors caused by a version mismatch between the version of the C header information on which DLX type specification was based and the version of the shared library that is loaded at run-time.

4. internal errors - Errors that are trapped because of illegal states (e.g. dereferencing a null pointer while a null pointer is not expected).

5. external errors - Errors that are untrapped by any of the above measures, but that are trapped as part of external safety mechanisms that have been built into the loaded shared library. (For example, Gtk and ODE have many of these internal consistency checks. We will get back to this in section 7.3.)

6. untrapped errors - Errors that have slipped through the net and cause undefined behaviour.

Of these errors the first three are the most likely to occur. Therefore, we will go into these in a little more detail.

6.6.2.1 Specification Errors

As far as specification errors are concerned, basically three things can go wrong.

26So far, bitfields are the only feature that requires special compiler support.

1. An incomplete type is specified and used before its specification is completed.

2. During translation from the C header specification to the DLX specification, a structure or union’s field is omitted or added.

3. During translation from the C header specification to the DLX specification, some types are erroneously exchanged (e.g. char ↔ int or float ↔ double).

Errors of this nature are likely to confuse our memory layout reconstruction algorithm and errors resulting from this are hard to debug.

The first form of specification errors is trapped by a special DLX error measure, called type closures, which basically prevents errors of this nature. Type closures are discussed in detail in section 6.6.3.

Unfortunately it is hard to detect the second and third form of specification errors. But during our test period we have not encountered these errors, thanks to the translation process that we used: first, the relevant type specification was looked up in the C header file; then, this specification was copy-pasted into our DLX specification file, after which we proceeded with the translation of the C type specification into a DLX specification. Since these two specifications are very much alike, this is a straightforward process.

It appears to be hard to make errors of this kind using this procedure. However, future changes to the C API may also allow the second form of specification errors to be made, for instance, when new fields are added as part of the API changes. These errors, in our terminology, are called synchronisation errors, which are discussed next.

6.6.2.2 Synchronisation Errors

We identify two causes for synchronisation errors:

1. A shared library is loaded with a different version than the version of the C header that was used for the DLX specification.

2. The API changes of a new release version of a shared library are not correctly reflected back to the DLX specification.

Shared Libraries

The first form of synchronisation errors is not specific to DLX. In fact, these kinds of synchronisation errors can occur whenever dynamic libraries are used. This is one of the reasons why large software vendors, most notably on WindowsTM, supply their own set of shared libraries, even when the library is already present on the target system. On Linux (and other unices), where collections of applications are typically compiled by the various distributions, this is less often the case. However, as long as a binary application that depends on a shared library is installed on a system with a different version of that shared library than the one against which the binary was compiled, such a system remains susceptible to these kinds of synchronisation errors.

There are, of course, measures to prevent these kinds of errors. The first measure is taken by the compiler and run-time environment: whenever a shared library is compiled, a version signature from the compiler is embedded in it. This version signature is compared to the version signature of the run-time environment. If there is a mismatch between the two, the library will not be loaded.

It is important to detect this, because if, for example, the compiler changes the way it represents structures or unions in memory (i.e. the alignment characteristics change), then libraries compiled with older versions of this compiler will be incompatible. In DLX, the existence of the dlx.so library27 always ensures compatibility of the memory reconstruction algorithm with the current version of the compiler. If an attempt is made to load dlx.so, or when dlx.so itself tries to load a library with different alignment characteristics, it will simply not load.

The second measure that can be taken is not an enforced measure, but many well-behaved libraries use it to detect slight differences between the API version of the library against which a certain application executable was compiled, and the API version of the library against which it is linked at run-time. For this, the measure uses two things: one or more preprocessor macro definitions that specify the version of the API at compile-time, and an API function that returns the version of the shared library at run-time. To illustrate this, please look at the source code fragments given in figure 6.19.

// Left: MyLib ver. 1.7.1

#define MAJOR_VERSION 1
#define MINOR_VERSION 7
#define PATCH_LEVEL   1

// Right: MyLib ver. 1.2.0

#define MAJOR_VERSION 1
#define MINOR_VERSION 2
#define PATCH_LEVEL   0

// Left: Application compiled
// against mylib ver. 1.7.1
#include

int main( void )
{
    check_version( MAJOR_VERSION,
                   MINOR_VERSION,
                   PATCH_LEVEL );
    // the rest of this function
}

// Right: MyLib ver. 1.2.0
#include

void check_version( int major,
                    int minor,
                    int patchlevel )
{
    if( major != MAJOR_VERSION ||
        minor != MINOR_VERSION )
        ; // error message and exit graciously
}

Figure 6.19: Many sensible libraries use a library version check as part of their initialization routine. The check is used to ensure that the library against which an executable was linked at compile-time is compatible with the library against which the executable is linked at run-time.

To the left we see an application that is compiled against a different version of the shared library than the version against which it is linked at run-time (depicted in source code form to the right).

Now, if there is a mismatch (or incompatible version) detected during a version check, then the library can bail out graciously, without untrapped run-time errors ever having taken place.

27The only library that DLX consists of.

This approach, as it is shown here in C, can be used in DLX in exactly the same way, except that the preprocessor macro definitions are replaced by Ruby constants.

All of the libraries that we have used during our experiments provide some variation of the mechanism described above, and we successfully used it as part of our DLX specification. (For an example, please see section 7.3.)
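A minimal sketch of what such a check could look like on the Ruby side (the constant and method names here are hypothetical, not taken from an actual binding; check_mylib_version stands in for a check driven by the library's bound run-time version query):

```ruby
# Hypothetical Ruby-side version constants, mirroring the C macros of
# figure 6.19: the version of the API the DLX specification was written for.
MYLIB_MAJOR_VERSION = 1
MYLIB_MINOR_VERSION = 7

# Compare against the version reported by the loaded shared library.
def check_mylib_version( runtime_major, runtime_minor )
  if runtime_major != MYLIB_MAJOR_VERSION ||
     runtime_minor != MYLIB_MINOR_VERSION
    raise "mylib version mismatch: specification assumes " +
          "#{MYLIB_MAJOR_VERSION}.#{MYLIB_MINOR_VERSION}, " +
          "loaded library reports #{runtime_major}.#{runtime_minor}"
  end
  true
end

check_mylib_version( 1, 7 )    # => true
# check_mylib_version( 1, 2 )  would raise, bailing out graciously
```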

Reflecting API changes

Another form of synchronisation errors arises when an API change in the shared library is not correctly reflected back to the DLX specification. We argued in section 6.2.3 that one way of reducing this risk is to make use of a specification that is very similar to the original C specification, thus making it as easy as possible to keep the two synchronised. However, as long as this remains a manual task, errors may occur. We have therefore started to research whether, and to what extent, it is possible to automate the translation and synchronisation process. This has proven to be harder than originally anticipated. More on the automated approach can be found later in this chapter, in section 6.8.1.

6.6.2.3 Usage Errors

DLX currently provides a limited form of usage error detection. If a pointer type value is passed to a function that expects an integer type parameter, or when a floating-point value is passed where an integer type is expected, the error is trapped in Ruby style, with a gracious error message and a method trace.
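The following sketch (plain Ruby, not DLX's actual implementation) illustrates the kind of argument check involved:

```ruby
# Hypothetical argument check: trap a usage error in Ruby style, with a
# gracious message (and, when raised, Ruby's usual method trace).
def check_integer_arg( name, value )
  unless value.is_a?( Integer )
    raise TypeError,
          "parameter #{name}: expected an integer type, got #{value.class}"
  end
  value
end

check_integer_arg( :width, 42 )     # => 42
# check_integer_arg( :width, 3.5 )  would raise a TypeError
```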

However, most usage errors are likely to be caused by passing pointers of incompatible types as a function’s parameters. While it would be interesting to see if we can trap such usage errors in the future, currently these types of errors are not trapped by DLX.

6.6.3 Type Closures

The order in which types are added to the DLX Type Database is fairly strict. This is partially caused by the dynamic environment in which the specification is done; at many points during the specification, ordinary Ruby (program) code can be executed. Because there is no ‘‘end of specification’’ indicator28, types that are already in the database and functions that have already been bound might be invoked or used before new types are introduced into the database. Therefore the types contained in the database must always be consistent. If this were not the case, incomplete types might be used, which would certainly wreak havoc in our memory layout reconstruction algorithm.

DLX prevents accidental errors in the DLX type specification, by attempting to create a type closure at every new definition of a type or typealias.

28We do not want such an indicator, as it would limit the freedom of Ruby’s dynamic programming style.

A type closure in DLX is a type-specific form of a transitive closure, where the relation R in x R y can be read as x consists of type y (or x references type y).

Let us explain this by means of an example. For the sake of simplicity, we reuse the distinction from section 6.5, that every type in DLX is considered either a simple type or a non-simple type.

Simple types, such as Integer Types and Floating-point Types, do not reference any other types; they are seen as terminals. Non-simple types, such as a Structure Type or a Union Type (but also a Function Type, etc.), are then non-terminals, since they reference certain other types as part of their type specification.

We have seen previously in section 6.5 that the memory layout reconstruction algorithm relies on all such referenced types being known in order to correctly determine any type’s size and alignment characteristics.

If this were not the case, then it would be very possible to reach a point where a type is used from an incomplete specification, wreaking havoc. Consider, for example, the following type definition:

class Point < struct "Point",
  [
    "int", :x,
    "int", :y,
  ]
end

typealias( "Point", "struct Point" )

In the above case, we have a ‘‘Point’’ class that depends on the size of the type int –and the way it is laid out in memory– to determine how to lay out the type ‘‘Point’’ itself. In this case, ‘‘Point’’ references only simple types, but what happens if it were to reference non-simple types as well? Let’s investigate...

Consider the following non-simple type:

class Circle < struct "Circle",
  [
    "Point", :center,
    "int",   :diameter,
  ]
end

Unlike the previous example, this example uses both a simple type (int) and a non-simple type (‘‘Point’’). To use this type, DLX must know exactly how to lay it out in memory, so that you may, for example, access the diameter field. Fortunately, because we have previously defined the non-simple type ‘‘Point’’, DLX already knows of this type, and so it knows how to access the diameter field without trying to access it at the wrong address in memory. But what would happen if we referenced a non-simple type that hasn’t been defined yet? This is the tricky part of the story, so again, we will investigate...

First we define yet another non-simple type:

class Square < struct "Square",
  [
    "Dimension", :dimension,
    "Point",     :offset,
  ]
end

In this example, we use two non-simple types. The first is one that we’ve already seen before: the ‘‘Point’’. However, the second type is one that we have not seen before: the ‘‘Dimension’’. (Note: The fact that we use only non-simple types is not the issue here; we may as well have used some simple types in this example as well.) The real problem is that, because DLX’s memory layout reconstruction algorithm knows neither the layout nor the size of the memory occupied by ‘‘Dimension’’, it also cannot say anything about the size and layout of the memory occupied by ‘‘Square’’. For example, if ‘‘Dimension’’ were a type alias to a pointer, its size would have been sizeof(void*); however, if it were a type alias to a structure (like the definition of ‘‘Point’’), then the offset of the :offset field would likely have been something like twice the sizeof(int)29.

And there we are at the root of the problem:

If DLX cannot fully determine all types at the time they are referenced, then it is said that it cannot perform a formal type closure; that is, there are still types referenced that have not previously been defined. And because such incomplete types would interfere with the correctness of our memory layout reconstruction algorithm (possibly causing some very bad, inexplicable segmentation faults), DLX prevents further execution of the program and terminates it by raising an appropriate error.
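The idea behind the closure check can be sketched as follows (hypothetical data structures, not DLX's internals): before a new compound type is admitted to the database, every type it references must already be present.

```ruby
# Hypothetical miniature type database: a type maps to the types it references.
TYPE_DB = {
  "int"   => [],                # simple type: a terminal
  "Point" => [ "int", "int" ],  # non-simple type, already fully specified
}

# A new type may only be admitted when all of its referenced types are known.
def closure_possible?( referenced_types )
  referenced_types.all? { |t| TYPE_DB.key?( t ) }
end

closure_possible?( [ "Point", "int" ] )        # => true:  Circle is admissible
closure_possible?( [ "Dimension", "Point" ] )  # => false: Dimension is unknown
```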

Forward Declarations

There is one exception to the strictness of the order in which types are to be introduced into the database.

If a certain structure or union A contains a reference to another structure or union B, and if B, in turn, contains a reference to A, this would result in a typical chicken and egg problem.

In C, where the problem occasionally arises as well, there is a solution to this: the forward declaration. A forward declaration allows one type (A) to be declared upfront. This type declaration results in an incomplete type definition. After the first type has been declared, type B is defined, referencing incomplete type A. Then the previously declared type A is defined, leaving no incomplete types.

DLX also supports forward declarations, by special use of the typealias command: Whenever a type alias is made to a type of void*, DLX allows the type to be specified later. When –at some later point in the specification– the type becomes fully specified, DLX will reflect the change back to all earlier occurrences of the particular type.

29We intentionally leave out any alignment difficulties here, which also play a role.

typealias( "struct ForwardDeclarated*", "void*" )

Figure 6.20: Forward declarations in DLX

DLX forward declarations, however, are slightly different from C’s. C allows a struct or a union to be forward-declared, whereas DLX only allows a pointer to a struct or union to be forward-declared. This is intentional, because it is the only way to guarantee Type Database consistency at all times30. These differences, however, do not appear to impact the effectiveness of DLX, as will be shown in the next chapter.

Defining Only What You Need

Apart from making a forward declaration, there is another use for making a typealias to void*. Making a typealias to void* allows for –as we would like to call it– an opt-out from having to specify a complete API using DLX when only a small portion is required. The opt-out allows one to define only those portions that are needed. This is very useful in early stages of development or binding creation, as it allows one to add functions or types on a when-needed basis only. Because the specification is fairly similar to C, it does not form much of a distraction, so that users can focus on the original problem (which is probably not: creating the multi-language interface).

During the specification of the API, the type closure algorithm will always make sure that the database is consistent, avoiding errors.

For every new type for which a specification is required, there is the option not to specify this type any further. However, because it is typealiased to a void*, it can still be used under its original name (e.g. Circle*) as part of the specification of other types. This means that there is no need to textually substitute the specific pointer type (e.g. Circle*) with an explicit void* in the DLX type specification text (i.e. inside the Ruby source file); such substitution would contaminate the DLX type specification if, at a later date, the type (e.g. Circle*) really is added to the DLX type specification.

Once the type is aliased to void*, it is considered to be a generic DLX pointer (a DLXCPtr). As previously described in section 6.4.4, any C object can be turned into a generic pointer, and any such pointer can be stored inside a normal Ruby variable using the DLXCPtr wrapper object. And, like any other object, it can be passed as a parameter to any function that expects that (implicit) pointer type.
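The effect of such an opt-out alias can be illustrated with a small plain-Ruby model (this is not DLX code; the alias table and helper names are our own, for illustration only): an unspecified type is mapped onto "void*", so it stays usable by name while remaining opaque.

```ruby
# Toy model of the alias table (plain Ruby, not DLX internals): an
# unspecified type is mapped onto "void*", so it stays usable by name
# while remaining opaque.
ALIASES = {}

def typealias(new_name, existing_name)
  ALIASES[new_name] = existing_name
end

# Resolve a type name through the alias chain to its underlying type.
def resolve(name)
  name = ALIASES[name] while ALIASES.key?(name)
  name
end

typealias("Circle*", "void*")    # opt out: leave Circle* unspecified
typealias("Shape*", "Circle*")   # aliases may chain

resolve("Circle*")   # => "void*"
resolve("Shape*")    # => "void*"
```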

6.7 Penalties

The solution that we present in this chapter with DLX, incurs some penalties. Some of these penalties are caused by the current implementation, while others are the result of the design choices that we have made.

30This is because the size of a void* is fixed and always known.

6.7.1 Execution Speed Penalties

First of all, because we have chosen a run-time approach to establishing a language interface, we suffer a small penalty in execution speed. Our generic library loading Ruby C extension requires a little more processing than would have been the case with a specific, tailor-made solution constructed directly using the Ruby C extension interface. However, a small penalty is considered acceptable, given the fact that our interface is much more maintainable, and Ruby programs requiring certain libraries can be used directly on the target platform31, without the need to create, compile, port, and distribute an extra Ruby extension library.

The current implementation also suffers an extra penalty in execution speed, caused by current limitations of the Ruby extension interface. This subject is handled in more detail as part of the discussion of the performance experiments that we have conducted, in section 7.5.1.

6.7.2 Memory Penalties

DLX also suffers from memory penalties. These penalties are not so much caused by the current implementation; they are more fundamentally caused by our choice of a run-time language interface.

In C, the type information from the header files is used only at compile-time. DLX (and Ruby) require type information to be available at run-time. For DLX, this type information must be in the Type Database otherwise the Ruby-to-C and C-to-Ruby Mappers cannot perform the mapping. For large APIs with a lot of types and functions, the memory penalties can become quite large.

6.7.3 Startup Penalties

Because of the type information, DLX also suffers from startup penalties. For this, two causes can be identified:

1. All functions are bound during startup.

2. All type information is parsed, processed and kept consistent during startup.

6.7.4 Possible Solutions

We suggest two possible solutions to minimize the impact of both the memory and startup penalties.

31We assume here that the original library is available for that platform, obviously.

Solution 1: Delay Binding

In Ruby it is possible to catch a failed call to a method if the called method does not exist (i.e. it is not defined). To do this, one uses the generic Ruby method method_missing, which is always called in this case. The arguments to this method are the name of the missing method, followed by each of the original parameters that were passed to it. We use this method to introduce a form of late binding, by only parsing and binding functions when they are called. This obviously introduces a new problem: the first call to such a method will incur some delay. See section 6.3.3.1 for more information about DLX and this form of late binding.
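The mechanism can be sketched in plain Ruby (this is not the actual DLX implementation; the class name and the fake binding step are illustrative only): a function is parsed and bound only on its first invocation.

```ruby
# A minimal sketch of delayed binding via method_missing (plain Ruby, not
# the actual DLX implementation): a function is "bound" only on first call.
class LazyLibrary
  def initialize
    @bound = {}   # method name => callable, filled lazily
  end

  # Called by Ruby whenever an undefined method is invoked; receives the
  # method name followed by the original arguments.
  def method_missing(name, *args)
    @bound[name] ||= bind_function(name)  # first call: parse and bind
    @bound[name].call(*args)              # dispatch to the bound function
  end

  def respond_to_missing?(name, include_private = false)
    true
  end

  private

  # Stand-in for DLX's parse-and-bind step for a C function.
  def bind_function(name)
    ->(*args) { "#{name} called with #{args.inspect}" }
  end
end

lib = LazyLibrary.new
lib.draw_circle(3)   # => "draw_circle called with [3]"
```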

Solution 2: Profiling

Both problems, but especially the second problem, can be solved by introducing some form of profiling. This profiler would then extract all used types and methods for a particular program. Only those type definitions and methods that are actually used are parsed and kept in memory. This profiler will somehow need to automatically solve the problem of the Type Closures from section 6.6.3.

6.8 Advanced Topics

In this section we present advanced topics that we, because of their advanced and complicated nature, have not yet researched in great detail. However, we have explored some of the possibilities of our solution with respect to each of these topics, so that they may provide a starting point for future developments.

6.8.1 AutoDLX

Because the DLX type specification is so remarkably similar to the original C header specification, one soon starts to wonder whether it would be possible to automate this to some extent.

We have investigated this. Unfortunately, it appears to be less straightforward than it seemed at first glance. We did implement a solution that, when applied to the OpenGL API, successfully and automatically transformed the OpenGL and related header files into a DLX specification. In general, however, this current implementation is insufficient. When we applied it to other C API specifications, we encountered the following difficulties:

• C APIs are not pure.

• The C grammar is context sensitive.

• Problems with automatic renaming.

What we mean by these will be discussed in the following subsections.

C APIs are not pure

The first problem that we encountered is that C APIs are not pure. By this we mean that, in general, a C API is not purely specified in C. Instead, C type and object specifications are mixed with a collection of preprocessor definitions.

The preprocessor definitions manifest themselves in various ways:

• constants - Constants are typically in the form of #define MYCONSTANT 0x9761438. The OpenGL API has many of these, but since they are easily translated into Ruby constants, they are not much of a problem.

• macros - Preprocessor macros are a harder problem, because these can be seen as a form of inline functions, that are literally (textually) expanded at various locations in the source code, right before compilation. Ruby DLX does not use any C code, nor does it require a compilation step. This means, that any macros that are part of the API will need to be automatically translated into pure Ruby, a non-trivial task.

• platform-dependent conditionals - Another hard problem is caused by platform-dependent conditionals. These are the #ifdef __WINDOWS ... #elif defined(__LINUX) ... #endif conditional preprocessor instructions that select the appropriate type specification for the current target platform. While it is possible to mimic these definitions by hand in Ruby using simple if-then-else conditionals, it is much harder if you want to do this automatically without prior knowledge of each and every possible preprocessor conditional. A complicating factor is that very often a new preprocessor conditional is introduced as part of the parse and expand flow of the preprocessor itself (e.g. #ifdef __WINDOWS #define _USE_LOAD_LIBRARY #endif).
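To illustrate the first and last items, a possible hand translation of such preprocessor definitions into plain Ruby could look as follows (the constant and library names are made up for this example; only the translation pattern is the point):

```ruby
# A possible hand translation of preprocessor definitions into plain Ruby
# (constant and library names are made up for this example):
ON_WINDOWS = !!(RUBY_PLATFORM =~ /mswin|mingw|wince/)

MYCONSTANT = 0x9761438            # from: #define MYCONSTANT 0x9761438

if ON_WINDOWS                     # from: #ifdef __WINDOWS ... #else ...
  DEFAULT_LIB = "mylib.dll"
else                              # ... #endif
  DEFAULT_LIB = "libmylib.so"
end
```

Doing this automatically, without prior knowledge of every possible preprocessor conditional, is the hard part.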

The C Grammar is Context-Sensitive

The second problem is that the C grammar is not a simple context-free grammar. Any lexical analyzer that is to analyze C is required to receive feedback information from the parser as it progresses through the source code.

The problem is caused by the fact that the lexical analyzer has to deal with ambiguous token descriptions: both ordinary identifiers and type names are defined by exactly the same token description.

We shall briefly explain how every C lexical analyzer has to solve this problem. First recall that, using the typedef facility from section 6.4.8, it is possible to introduce new type names in C. Now, when the tokenization process starts, the lexical analyzer classifies, by default, any identifier or type name that it encounters as just an identifier. Classified tokens are supplied to the C parser, which starts parsing. Whenever the parser encounters the typedef facility, it identifies the new type name. The type name is then sent back to the lexical analyzer in a feedback construction. From that point on, every time an identifier is classified that matches this particular type name, it is classified as a type name instead.
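This feedback construction can be modeled with a toy Ruby lexer (purely illustrative; a real C lexer is considerably more involved): an identifier is classified as a type name only after the parser has fed that name back.

```ruby
# Toy Ruby model of the lexer-feedback construction: the "lexer"
# classifies an identifier as :TYPE_NAME only after the "parser" has fed
# the name back via a typedef.
class FeedbackLexer
  def initialize
    @type_names = {}           # filled by parser feedback
  end

  def feedback_typedef(name)   # parser saw: typedef ... name;
    @type_names[name] = true
  end

  def classify(token)
    @type_names[token] ? :TYPE_NAME : :IDENTIFIER
  end
end

lex = FeedbackLexer.new
lex.classify("GtkObject")           # => :IDENTIFIER
lex.feedback_typedef("GtkObject")   # parser encountered the typedef
lex.classify("GtkObject")           # => :TYPE_NAME
```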

To AutoDLX this poses a major problem: in order to parse just a single portion of a C header file, and to let the parser identify possible type names, AutoDLX needs to parse all included header files. Otherwise, the lexical analyzer that is part of AutoDLX does not know which tokens to classify as identifiers and which tokens to classify as type names.

Combine this difficulty with the platform-dependent conditionals from the previous section and we have just created an even more complicated problem.

Renaming strategies

The last problems that AutoDLX needs to deal with are difficulties caused by automated renaming strategies. Recall that, since in Ruby a class name is a constant, it must always start with a capital letter. C, on the other hand, is more relaxed in its definition of type names (e.g. of structures and unions). Since in DLX the type names of such structures or unions are mapped onto equivalent class names, this requires a sensible renaming strategy.

To illustrate this type of problem, please consider the following example:

C:

struct _GtkObject
{
    ...
};

typedef struct _GtkObject GtkObject;

DLX:

class GtkObject < struct "GtkObject",
[
    ...
]
end
typealias( "GtkObject", "struct GtkObject" )

Figure 6.21: Choosing sensible type names.

To the left we see the type definition of a ‘‘struct _GtkObject’’, which is later typedefined to a ‘‘GtkObject’’.

Any human immediately sees that probably the best name for the Ruby class would be ‘‘GtkObject’’ rather than ‘‘_GtkObject’’. For a computer algorithm, on the other hand, coming up with the best possible new name is much harder.

6.8.2 Memory Management

While it is not the primary focus of the research for this thesis, memory management, and the problems caused by it, are known, hard-to-solve issues in any multi-language interoperability interface.

We have already explained that most memory management related problems arise because of object sharing (see section 4.7.3).

We see several approaches that look promising for handling memory management in DLX. To explain these approaches, first recall from sections 5.2 (where we explained our choice for selecting Ruby and C) and 6.4.2 (where we explain Ruby types from an implementation detail perspective) that Ruby is written in C, and that Ruby objects (apart from the simple direct values) are implemented as a thin layer around the actual C objects.

In Ruby, all memory management is done automatically by a mark+sweep garbage collector[68]. This garbage collector is also made available to the Ruby Extension API, the interface that we introduced in section 5.5.2, and the interface on top of which DLX is built.

With the mark+sweep garbage collector interface available to DLX, the question arises: ‘‘Could we use it as a basis to solve memory management issues that we listed in section 4.7.3?’’

We see two possible approaches to use the mark+sweep algorithm to track and manage C objects that have crossed the bridge. We can either do this explicitly, at C level, as part of the DLX extension library (i.e. the dlx.so); or we can do it implicitly, by passively waiting for Ruby to reclaim Ruby objects and use this as a signal to also clean up the C objects.

Explicit Approach

Since DLX uses a Ruby object to wrap a pointer to a C object32, whenever that C object is no longer either directly (e.g. by means of the wrapping Ruby object) or indirectly (e.g. as a structure field or as an element of a C array) accessible from Ruby, it is likely that the object can be reclaimed. We can use this information as input to the mark+sweep functions directly via the Ruby extension API interface.

Implicit Approach

On the other hand, Ruby also has a notion of finalization. That is, whenever a Ruby object is garbage collected by the mark+sweep algorithm, a special function is invoked that allows the programmer to perform some clean up actions. In theory, whenever a DLX wrapper object is finalized, this should be a good moment to reclaim the memory of the C object that it wraps.
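In plain Ruby, the implicit approach could be sketched as follows (this is not DLX code; fake_c_free simulates a C-level free routine, and actual finalization timing is decided by Ruby's garbage collector):

```ruby
# Plain-Ruby sketch of the implicit approach: attach a finalizer to a
# wrapper object, so that when Ruby garbage-collects the wrapper, a
# (here simulated) C-level free routine for the wrapped pointer runs.
FREED = []

def fake_c_free(address)   # stand-in for free() or a library destructor
  FREED << address
end

def wrap_c_pointer(address)
  obj = Object.new
  # The finalizer proc must not capture obj itself, or obj stays
  # reachable and is never collected.
  ObjectSpace.define_finalizer(obj, proc { fake_c_free(address) })
  obj
end

wrap_c_pointer(0xdeadbeef)
GC.start   # once the wrapper is collected, the finalizer frees the C object
```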

We have done some preliminary experiments with these ideas, which we discuss in the next chapter in section 7.6.

6.8.2.1 Discussion

The results seem promising, but it is too dangerous to draw conclusions from such a preliminary experiment.

For instance, what happens if some API retains a reference to an object of which we think is no longer accessible from within Ruby?

Although it is likely that this is documented as part of the API’s semantics, in which case we might be able to express such semantics by introducing additional DLX specification syntax, it remains to be seen if this is adequate.

32We ignore direct values here.

In the end, we think that a combination of both approaches (explicit use of mark+sweep and implicit using Ruby finalization) would give enough control to effectively address the memory management problem in the future.

For now, it is clear that there is much to be researched in this department, so we will add it to our list of future work.

6.8.3 Threading

Also not a primary focus of this thesis, but deemed important enough to be mentioned here, is the topic of threading.

In DLX we distinguish three distinct threading strategies.

1. Ruby provides the threading support.

2. The C library provides the threading support.

3. Both Ruby and a C library provide threading support.

We have only experimented with the first two approaches (for instance, the Test Buggy experiment and the experiment with the Gtk Tic Tac Toe widget rely on threading provided by the C library). The third, and last, approach is a really hard problem that will need extensive research by itself.

This research must then seek effective approaches to synchronizing such threads, and to thread locking when multiple threading strategies are in use at the same time.

For now, we can only offer the following advice:

• Avoid using two threading strategies (in multiple languages) at one single time.

• If you must use threading, and the C library does not require it, prefer Ruby’s threading support over any special C threading library. This has the following benefits:

– Ruby thread support is lightweight and portable.

– All Ruby threads can access any wrapped C object.

– C objects are already wrapped inside Ruby objects, so this can be used to facilitate thread locking, even on C objects.

• If the C library requires threading, by all means, use it (it appears to be working), but then also rely on the mechanisms that it provides for thread locking. Do not try to do self-implemented thread locking with pure Ruby objects (this is also in accordance with our first point).
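The second piece of advice can be illustrated with a small sketch (the wrapper class below is hypothetical; a Ruby array stands in for a wrapped C object): because every C object is reachable through its Ruby wrapper, an ordinary Ruby Mutex per wrapper suffices to serialize access to the C object.

```ruby
# Sketch: one Mutex per wrapper serializes all access to the wrapped
# (simulated) C object, even when several Ruby threads use it at once.
class LockedWrapper
  def initialize(wrapped)
    @wrapped = wrapped
    @lock = Mutex.new
  end

  # All access to the wrapped object goes through the lock.
  def with_object
    @lock.synchronize { yield @wrapped }
  end
end

counter = LockedWrapper.new([0])   # array stands in for a wrapped C object
threads = 4.times.map do
  Thread.new { 1000.times { counter.with_object { |c| c[0] += 1 } } }
end
threads.each(&:join)
counter.with_object { |c| c[0] }   # => 4000
```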

Chapter 7

Experiments

In this chapter, we describe and discuss the various experiments that we have conducted to verify and validate our solution, DLX. The experiments are described in roughly the order of importance to us at this stage. That is, correctness is more important than effectiveness which is more important than performance and so on.

7.1 Correctness Experiments: Lab Tests

In this section, we describe two correctness experiments. These can be considered experiments conducted under ‘‘laboratory’’ conditions: they are conducted with libraries that have been especially created1 for testing purposes only.

7.1.1 Memory Alignment Correctness

One of the most challenging parts of DLX is the memory layout reconstruction, the algorithm of which is described in section 6.5. It is crucial to the successful operation of DLX that it is performed correctly. Because results may vary due to compiler and platform differences, we have constructed an Alignment Test Generator that performs a statistical test for memory alignment correctness.

Alignment Test Generator

The Alignment Test Generator generates an arbitrary number of memory alignment tests in the form of generated C source code. The generated source code is then compiled against the DLX memory layout reconstruction back end. A test is successful if and only if dynamic field access to each of the generated test structures yields no errors.

1They have been either crafted by hand, or generated as part of the testing process.


A memory alignment test can be divided into five parts.

First, a static test structure is generated. This structure is to be processed and compiled by the C compiler for the target platform.

struct Test_struct0
{
    float f0;
    unsigned int bf1 : 6;
    long l2;
    unsigned int bf3 : 6;
    unsigned int bf4 : 7;
};

Then, a dynamic representation of the structure is generated that is used by the memory layout reconstruction algorithm.

void* struct_ary[] = { (void*)20,
    (void*)TYPE_FLOAT   , "float"       , "f0" , NULL    ,
    (void*)TYPE_BITFIELD, "unsigned int", "bf1", (void*)6,
    (void*)TYPE_LONG    , "long"        , "l2" , NULL    ,
    (void*)TYPE_BITFIELD, "unsigned int", "bf3", (void*)6,
    (void*)TYPE_BITFIELD, "unsigned int", "bf4", (void*)7
};

This is followed by an instantiation of the StructInfo object that holds information of the dynamic representation. A short verification is performed that tests the size of the static struct (as computed by the C compiler) against the size of the struct as calculated by our algorithm from the StructInfo object.

StructInfo structinfo = struct_new( "MyDummyModule",
                                    "Test_struct0",
                                    struct_ary+1 );
if( structinfo->size != sizeof(struct Test_struct0) )
{
    res += 1;
    fprintf( stderr, "size mismatch for struct Test_struct0:\n\
calculated size: %d != compiled size: %d\n",
             structinfo->size, sizeof(struct Test_struct0) );
}

Then, an instance of the test structure is filled statically with appropriate random values.

test_struct0.f0 = 428.238346297448;
test_struct0.bf1 = 7;
test_struct0.l2 = 427982407;
test_struct0.bf3 = 22;
test_struct0.bf4 = 83;

Finally, a field access test is performed for each member of the test struct. Errors are reported and accumulated for each field access.

tmp = struct_member_get( &test_struct0, structinfo, "f0" );
if( test_struct0.f0 != *((float*)tmp) )
{
    res += 1;
    fprintf( stderr,
             "test_struct0.f0 returned incorrect value: %f != %f\n",
             test_struct0.f0, *((float*)tmp) );
}

Setup and Results

In a typical test, we use the Alignment Test Generator to generate 10,000 structures, varying in size and composition. As stated before, the test is successful if and only if there is not a single error.

The results are given as part of the portability experiments (see section 7.2) in the next section, where we conduct several alignment correctness tests for various platforms.

A Note on Statistical Evaluation

Please note, that since this test uses a statistical approach, rather than an analytical approach, all assumptions related to statistical testing apply. That is, the test can really only prove incorrectness of the memory layout reconstruction for a given target compiler and platform; qualifications of correctness can only be made up to a certain, though arbitrary, degree, by increasing the number of alignment tests.
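To make this concrete, the degree of confidence can be quantified with a small plain-Ruby sketch (the bug fraction p is, of course, unknown in practice): if a hypothetical layout bug showed up in a fraction p of randomly generated structures, the probability that N independent tests all miss it is (1-p)^N, which is why increasing the number of alignment tests raises confidence.

```ruby
# Probability that N independent random alignment tests all miss a layout
# bug that affects a fraction p of generated structures: (1 - p) ** N.
def miss_probability(p, n)
  (1.0 - p) ** n
end

miss_probability(0.001, 10_000)   # a 0.1% bug survives 10,000 tests with
                                  # probability of only about 4.5e-5
```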

7.1.2 Unit Test Suite

In addition to the automatically generated memory alignment tests, we also conducted a series of hand-written unit tests to test each feature of DLX separately. These unit tests are written in Ruby (using Ruby’s unit testing facilities), and make use of a synthetically created shared library, called libcomplex. (Please note that it has nothing to do with complex numbers; rather, the name is chosen to contrast the examples with the simple types that usually accompany multi-language development solutions.) The unit tests are automatically conducted as part of the build process, so that it is always clear if Ruby/DLX is working on the target platform.

It is not possible to go into the details of each of the unit tests. We can only give a few examples. To make the examples a little more comprehensible, we will first give an excerpt of the DLX specification for libcomplex. For illustration purposes, the actual specification has been greatly simplified to cut down on required space.

class ComplexSubSub < struct "ComplexSubSub",
[
  "int", :id,
]
end
typealias( "ComplexSubSub", "struct ComplexSubSub" );

class ComplexSub < struct "ComplexSub",
[
  "ComplexSubSub**", :list,
  "char* (*simple_callback)(char* s)", :simple_callback,
  "char* (*complex_callback)(ComplexSubSub* css)", :complex_callback,
  "int (*complex_callback2)(ComplexSubSub* css)", :complex_callback2,
]
end
typealias( "ComplexSub", "struct ComplexSub" );

class Complex < struct "Complex",
[
  "struct ComplexSub*", :complexsub,
]
end
typealias( "Complex", "struct Complex" );

extern "Complex* newComplex()", delay;

Figure 7.1: An excerpt from the DLX specification of libcomplex.

All unit tests are conducted individually, and are used to test individual features of DLX for a specific known (expected) outcome. In the example below, two unit tests are shown that test two of the four callback flows that were identified in figure 6.7 of section 6.3.6.3. (En passant, we also see the automatic array wrapping, from section 6.4.5, in action in lines 6, 8, 22, and 24.)

As with the memory alignment correctness experiment, the unit test suite passes if and only if all of the individual unit tests succeed.

7.1.2.1 A Note on Unit Test Evaluation

Unit tests can only be used to prove incorrectness of a particular portion of the implementation. They can be used to test for functional regression, but they can never prove correctness; if some feature of DLX is not properly tested (because of a faulty or missing test), the use of any unit test suite will not expose this.

1  # test default callbacks (original callbacks residing in C)
2  def test_default_callbacks
3    complexsub = ComplexLib.newComplex.complexsub;
4    expect( complexsub.simple_callback( "DLX" ),
5            "DLX works great in C" );
6    expect( complexsub.complex_callback( complexsub.list[7] ),
7            "ComplexSubSub’s id in C: 7" );
8    expect( complexsub.complex_callback2( complexsub.list[7] ), 7 );
9  end
10
11 # test changing + calling callbacks
12 def test_nondefault_callbacks
13   complexsub = ComplexLib.newComplex.complexsub;
14
15   complexsub.simple_callback = self, :simpleCallback;
16   complexsub.complex_callback = self, :complexCallback;
17   complexsub.complex_callback2 = self, :complexCallback2;
18
19   expect( complexsub.simple_callback( "DLX" ),
20           "DLX works great in Ruby" );
21
22   expect( complexsub.complex_callback( complexsub.list[7] ),
23           "ComplexSubSub’s id in Ruby: 7" );
24   expect( complexsub.complex_callback2( complexsub.list[7] ), 7+1 );
25   # the callback performs the +1 above
26 end
27
28 def simpleCallback( s )
29   return( "#{s} works great in Ruby" );
30 end
31
32 def complexCallback( complexsubsub )
33   return( "ComplexSubSub’s id in Ruby: #{complexsubsub.id}" );
34 end
35
36 def complexCallback2( complexsubsub )
37   return( complexsubsub.id+1 );
38 end

Figure 7.2: Several callback unit tests. A selection of the many unit tests that have been created for DLX.

7.2 Portability Experiments

To test the portability of our solution, we have conducted two tests. First, we have conducted the experiments from the previous section on a variety of target platforms (i.e. the triplet of compiler, architecture, and operating system). Then we took our solution into the open field, and tested its portability using a real world example: a modest music playing application, called DLXPlayer, is shown running successfully on two very diverse platforms.

7.2.1 Platform Portability and Compatibility

Although there have been third-party reports stating that the current implementation of Ruby/DLX is working on a variety of Unix and POSIX operating systems (BSD, HPUX, etc.) and for several additional architectures (PowerPC and IA64), these could not be confirmed by us, due to lack of access to the required hardware and operating systems. We have been able to confirm that DLX runs all of the correctness experiments from the previous section successfully on the following platforms:

Compiler             Operating System   Architecture   Cross compiler
GCC2 3.2             GNU/Linux 2.6.5    IA32/x86       no
GCC 3.4              GNU/Linux 2.6.12   IA32/x86       no
GCC 4.1.1            GNU/Linux 2.6.19   IA32/x86       no
MSVC3 12.20.9615     Windows CE 4.2     ARM            yes
TCC4 version 0.9.23  GNU/Linux 2.6.19   IA32/x86       no

Table 7.2: The platforms on which DLX has been confirmed to be working.

7.2.2 Portability: A Field Test

To test if DLX actually successfully provides the separation of concerns regarding portability in the real world, we have conducted a small field test.

For this, we took two fairly distinct platforms to run our experiment:

1. A GNU/Linux 2.6 Desktop environment running on an IA32 architecture.

2. A PDA based around an ARM processor running Windows CE 2003SE.

As an experiment, we have written a small graphical application that loads and plays music files. The application makes use of:

• Ruby - As the main execution platform. (The small application is written in Ruby.)

• Simple DirectMedia Layer (SDL) - A shared library used for the (graphical) user interface.

• MikMod - A shared library that can be used to play music modules.

• DLX - To establish the interface between the main application written in Ruby on one side, and SDL and MikMod as shared libraries on the other.

All four basic components were available for both target platforms. The shared libraries on Windows CE are called SDL.dll and mikmod.dll, while on GNU/Linux they are called libSDL.so and libmikmod.so.

Apart from the small display resolution of the PDA (320x240), we constructed the test application without any prior assumptions regarding the target platform; i.e. there is no special code added to test for, or to take into account, the current execution platform. Still, using DLX, the same Ruby program runs unaltered and exactly the same on both platforms.

We have included two photographs as an illustration; they can be seen in figure 7.3.

Figure 7.3: DLXPlayer in action. Left: Photo of an Ipaq 2210 running DLXPlayer. Right: A screenshot of a Linux desktop running the same program.

7.3 Experiments on Effectiveness: Field Tests

To truly test the effectiveness of DLX, the laboratory tests are not sufficient. In requirement 6 we stated that our solution must be able to handle any API semantics. This means that, no matter in what way a shared library’s API was intended to be used, DLX must be able to interface it to Ruby.

Previous experience has taught us that quite a few language interface solutions fail sooner or later on real world examples5; in C some pretty unorthodox APIs are possible, that make perfect sense in hindsight.

What we would like to have is a series of tests to demonstrate the effectiveness of a multi-language interface between C and one or more other languages. Unfortunately, we could not find an existing set of such tests. Therefore, we have selected two examples of shared libraries which, we know from previous experience, have an unorthodox construction in their APIs, or form an otherwise very complete effectivity test. We have used these examples to do two small case studies.

5Or they require extensive programming in addition to, or instead of, a specification.

7.3.1 Effectivity Experiment I: ‘‘ODE TestBuggy’’

As the first experiment on the effectivity of DLX, we have selected the ‘‘ODE TestBuggy’’ application. This is an example application that comes as part of the Open Dynamics Engine (ODE) distribution, a physics simulation engine that comes as a set of shared libraries.

The example application, where a tri-wheeled vehicle –the Test Buggy, if you will– is moved in a virtual world, was selected for several reasons. First of all, the application uses a fair amount of all the language features that C has to offer, and it uses many of the features that DLX is equipped with.

Furthermore, the API contains a special construction in C. We would like to highlight this, to show how complete our mapping is, and to show how our mapping manages to keep the use of this API relatively simple, given the required API semantics.

We think that this example also shows what we aimed for when we suggested that computationally intensive tasks be done in C, while doing the rest in Ruby6.

ODE

To understand what actually is happening, we must first very briefly introduce ODE, and give a brief crash course in its terminology.

The Open Dynamics Engine, or ODE, is a physics simulation engine that can be used to simulate real world dynamics (e.g. gravity, motion, etc.). It consists of three parts,

1. a rigid body simulator;

2. a collision detection system; and

3. a simple 3D rendering engine (with OpenGL as back end).

To better understand the logic behind the control flow that we are about to discuss, please skim through the following terminology.

Rigid body - In physics, a rigid body is an idealization of a solid body, of finite size, in which deformation is neglected[11]. In ODE, a rigid body is simulated as a point mass that can be given various attributes, such as, mass, direction, rotation, and speed.

Joint - A rigid body is connected to another rigid body by a joint. A joint is used to constrain the relative freedom of movement between two or more rigid bodies. It is also used to conduct applied forces from one rigid body to another. Joints of various types exist in ODE, such as, a hinge, a ball in socket, and a rotation joint.

6The careful reader may notice that, since this is only a small demo application, e.g. without game logic and without much else, the 80/20 conjecture does not apply well here.

Geometric object - Since a rigid body in ODE is represented by a point mass, it has no volume or form associated with it. One or more geometric objects are associated with a rigid body, giving it a volume and a form. These objects are then used to detect collisions between two or more objects. Examples of geometric objects are a cube, a ball, a pyramid, and a polygon.

Collision space - Collision detection is a painfully computationally intensive task. To limit the number of geometric objects that can cause a collision, these objects are added to a collision space. Objects that are in the same collision space will not be tested for impending collisions.

Contact and contact joint - If the collision detector detects that a collision is impending, it creates a contact point that describes the location where two geometric objects are colliding. At this location, a special temporary joint can be created (a contact joint) between the rigid bodies that are associated with the geometric objects. This way, the collision is reflected back to the rigid body simulation.

Test Buggy Control Flow

While having only a single explicit function invocation direction may seem limited, we will show next that, by using callbacks, a very dynamic control flow is possible between Ruby and C. DLX allows shared libraries to make maximal use of this.

To illustrate this tightly orchestrated tango between Ruby, DLX, and C, please consider the schematic overview of the Test Buggy demo in figure 7.4. Indicated with a number inside a white circle are the 7 major steps that are taken in the demo. We will go through each of these below7.

1. When the application is started, it sets up the Test Buggy by creating all required rigid bodies, connecting them via the right joints, associating each body with a geometric object, which, in turn, is added to a collision space.

2. Then, a Render Info object is passed to the 3D render engine (and execution continues in C). This Render Info object is a C object that is created and allocated in Ruby. From a Ruby point of view, just a .new function is invoked, so allocation details are hidden from the end user (see section 6.4.5). The Render Info object contains:

• A version field, that is initialized in Ruby with the version of the header file on which the DLX specification was based.

• A set of callbacks, that will be invoked as part of the render loop (some of the callbacks point to Ruby methods and some of them point to C functions).

• Data paths to graphics and the like.

Internal to the shared C library, the 3D Render Engine then:

(a) Verifies that the API version that the main application expects to use is compatible with the shared library (see section 6.6.2.2).

(b) Sets up the render world and enters the main render loop.

7The number in the list corresponds to the number in the schematic overview of figure 7.4. 116 Chapter 7. Experiments

Figure 7.4: The TestBuggy’s control flow between Ruby/DLX and C.

(c) For each iteration of the render loop, it invokes several callbacks (which are taken from the Render Info object). One of the callbacks is the simulation loop callback. This callback is defined by us in Ruby (so execution control is given back to Ruby code).

3. Inside the simulation loop callback, user input is used to update motor and steering parameters.

4. Then dSpaceCollide is called to let ODE’s collision detection system calculate possible collisions. By this invocation, within the same function call stack, execution control is again passed back to C. As a part of the call to dSpaceCollide, a standalone Ruby callback object is passed that points to the pure Ruby Near Collision callback.

5. The collision detection system calculates all possible collisions, and for each of these, it invokes the pure Ruby Near Collision callback, passing execution control yet again back to Ruby.

Inside the Near Collision callback, a quick test is made to see if the collision is interesting enough; if so, it:

(a) Retrieves the contact points using a call to dCollide (more on this later).

(b) Modifies the contact parameters, such as slipperiness and bounciness. (Field access of these structs is, of course, fully encapsulated, as was described in the previous chapter in section 6.4.5. To illustrate this, see lines 14-20 in figure 7.5.)

(c) Creates contact joints, and attaches these to the geometric objects (which are, in turn, associated with rigid bodies).

6. Once the rigid bodies are temporarily connected via the contact joints, a call to dWorldStep is used to advance the simulation forward. This will update the location and orientation of the rigid bodies (and their associated geometric objects).

7. Finally, using the 3D render engine’s functions, the geometric objects that have been associated with the rigid bodies are rendered8.

We think that the dynamic control flow, which switches between Ruby and C flawlessly several times, clearly demonstrates that there is more to interfacing Ruby to a shared library than people at first expect. Even though we only allow explicit invocation of C functions from inside Ruby, this does not mean that the interface is simply unidirectional.

This also demonstrates that the above shared C API is not just a simple API that merely contains a collection of ‘‘flat’’ functions. There must be really tight interfacing for the Test Buggy application to work.

To elaborate on this a little further, we wish to highlight a small detail from the API, in which we think the limitations of ‘‘seamlessness’’ come into view, and how DLX handles it.

Collision Callback API Detail

The detail that we wish to highlight concerns the collision callback API of ODE, in particular, the dCollide function from step 5a. It is not so much the function call itself that is peculiar, but what it expects to be passed as parameters. Essentially, the function expects to receive an array of elements of type struct Contact (for a description of this structure, see figure 7.6).

A struct Contact has two embedded structures. Please note, these are not pointers to structures, but nested structures.

As a part of API semantics of dCollide, a C programmer must:

1. Supply an allocated block of memory capable of holding a certain number of ‘‘struct Contact’’ elements. Again, please note that we say structures, not pointers to structures.

8It is common to use a 3D model with a higher polygon count for this step.

 1 N = 10; # handle at most 10 contact points per collision
 2 @contact_ary = ContactArray.new!( N );
 3
 4 def nearCollisionCallback( data, geom1, geom2 )
 5   contact = @contact_ary;
 6   n = ODE.dCollide( geom1, geom2, N, contact[0].geom, Contact.size );
 7   n.times do |i|
 8     contact[i].surface.mode = ContactSlip1 | ContactSoftERP | ... # etc.
 9     contact[i].surface.mu = Infinity;
10     contact[i].surface.slip1 = 0.1;
11     contact[i].surface.slip2 = 0.1;
12     contact[i].surface.soft_erp = 1.5;
13     contact[i].surface.soft_cfm = 0.1;
14
15     contactjoint = ODE.dJointCreateContact( world, contactgroup, contact[i] );
16     ODE.dJointAttach( contactjoint, ODE.dGeomGetBody( contact[i].geom.g1 ),
17                       ODE.dGeomGetBody( contact[i].geom.g2 ) );
18   end
19 end

Figure 7.5: Near Collision callback.

2. Supply the number of elements of type ‘‘struct Contact’’ that the block of memory can hold.

3. Supply the memory address of (i.e. a pointer to) the first ‘‘struct ContactGeom’’.

4. Supply the size of ‘‘struct Contact’’.

1 class Contact < struct "Contact",
2 [
3   "struct SurfaceParameters", :surface,
4   "struct ContactGeom", :geom,
5   "Vector3", :fdir1
6 ]
7 end
8 typealias( "Contact", "struct Contact" );

Figure 7.6: DLX type specification for Contact.

To translate this into Ruby/DLX semantics, we are faced with two problems:

In the C source code, this results in the use of a mixture of . and -> operators. So there is a challenge as to how this must be handled.

Furthermore, we must be able to distinguish between arrays that merely contain pointers to structures and arrays of real structures (like in this case).

Because DLX hides all notions of pointers, this is a problem. Therefore, we introduced new terminology: contiguous versus non-contiguous arrays (see section 6.4.7).

With the above knowledge, we can translate the above C API semantics into Ruby/DLX API semantics.

In Ruby/DLX the programmer must:

1. Supply a contiguous array of ‘‘Contact’’ elements (e.g. using the ‘‘new!’’ method instead of the more common ‘‘new’’9).

2. Supply the size of the above array (either as a separate variable or as a call to its .size instance method).

3. Simply reference the ‘‘Contact.geom’’ field. When it is supplied to a function that expects a pointer, it is automatically supplied as one.

4. Supply the ‘‘Contact’’ size (with a call to its .size class method).

As we have described in the previous chapter, in DLX we hide the notion of pointers, and we automatically switch between the interpretations of a structure (i.e. either a pointer to a structure or a real structure, e.g. one nested inside another), depending on the context. Therefore, in DLX, we can always use the . operator, and no mixture of ‘‘.’’ and ‘‘->’’ operators is needed. This is demonstrated in figure 7.5.

Figure 7.7: Screenshots from the ODE TestBuggy demo application.

9Please notice the exclamation mark in the first method name.

Final Note and Screenshots

In DLX, one of our goals is to supply an easy-to-create and maintainable multi-language interface, by means of a simple specification. At the same time, we want to shield end users from explicit low-level details, such as pointers and pointer arithmetic (which are normally hidden from Ruby programmers).

Experiments like the Test Buggy help us to ensure the effectiveness of such a multi-language interface, even with the more unorthodox API semantics. In these cases, the boundaries of what we can achieve by specification, and of what we can hide from the user for the sake of seamlessness, are really stretched to the extremes.

Finally, let us conclude this section with a little eye candy. A few screenshots of the running demo can be seen in figure 7.7.

7.3.2 Effectiveness Experiment II: ‘‘Inheritance in C’’

We think that the previous experiment already shows a great deal of the possibilities of DLX; however, a few features of DLX are not tested by the TestBuggy experiment10. These shall be tested as part of our second experiment: ‘‘Inheritance in C’’.

Although C is not an object-oriented language itself, by creatively using Structure Types, Union Types, and function pointers, features that are traditionally attributed to object-oriented languages, such as inheritance, polymorphism, and reference counting, can be emulated in C.

One example of using an object-oriented approach in C is the GTK+ Object System[5]. It is the object-oriented system that is used as the basis for GTK+, an open-source windowing toolkit for creating graphical user interfaces.

In this experiment, we are going to create a ‘‘subclass’’ in the sense of the GTK+ Object System. For this, we are going to use nothing but pure Ruby and DLX. Normally, a subclass in the GTK+ Object System must be written in C and compiled to native code before it can be used. We think that we have a strong point if we can convince GTK+ that our subclass, which will be in the form of a custom widget, is a legitimate GTK+ widget.

Our belief that we have successfully achieved this is not only strengthened if the demonstration program appears to work correctly; GTK+ also performs various internal consistency checks to test for errors11 when we introduce our subclass to the GTK+ Object System.

We must stress that we do not wish to advocate the use of C in this fashion and subsequently interfacing it to Ruby with DLX. Ruby is a pure object-oriented language and, as such, it does a better job than C in this regard. By this experiment we simply want to show the effectiveness of DLX when interfacing Ruby to this, or any other, shared C library.

10The most notable of these are: Deep nesting of structures; and bitfields.

11We categorized these as external errors in section 6.6.2.

GTK+ Object System

Inheritance in the GTK+ Object System is achieved by nesting structures within each other. So if, for instance, the GtkButton class inherits from GtkWidget (which it does), then a GtkButton structure must start with a nested GtkWidget structure. This ensures that a pointer to a GtkButton can be cast into a pointer to a GtkWidget.

There are two types of structures that are nested in this way:

1. Instance structures

2. Class structures

The structures mentioned above are called instance structures, as they represent instantiated objects.

Each instance is also associated with a class structure. This is essentially a table of pointers to functions that are the class’s implementation. The function pointers in this table can be reassigned by subclasses that override particular functions. This essentially allows for polymorphism. (For example, the GtkWidgetClass structure includes a pointer to a draw() function; GtkButtonClass provides a specific implementation that draws buttons.)
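The dispatch mechanism behind such a class structure can be mimicked in plain Ruby to make it concrete. The sketch below is ours, not GTK+ code (all names are hypothetical): the class structure is modelled as a table of callables, and a ‘‘subclass’’ inherits the parent table and overwrites one slot.

```ruby
# A "class structure": a table of function pointers (here: lambdas).
WidgetClass = { :draw => lambda { |w| "generic widget: #{w}" } }

# "Subclassing": copy the parent table, then override the draw slot,
# just as GtkButtonClass overrides the draw() pointer of GtkWidgetClass.
ButtonClass = WidgetClass.merge( :draw => lambda { |w| "button: #{w}" } )

# Dispatch always goes through the instance's class table, so the same
# call site behaves polymorphically.
def dispatch_draw( klass, instance )
  klass[:draw].call( instance )
end

puts dispatch_draw( WidgetClass, "w1" )  # => generic widget: w1
puts dispatch_draw( ButtonClass, "b1" )  # => button: b1
```

The table is shared by all instances of a class, which is why GTK+ keeps it in a separate class structure rather than in each instance structure.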

Setup

The input for this widget is taken from the first example of Chapter 21, Writing Your Own Widgets[64], which is part of the GTK+ 2.0 tutorial[66]. In this experiment, we left much of the original source code intact (apart from translating it into Ruby and DLX). We have not bothered with giving it a more Ruby-like appearance.

The Widget

In this experiment, we shall try to create a subclass of the GTK+ widget GtkVBox. The inheritance path of GtkVBox is given below.

GObject
 +— GInitiallyUnowned
    +— GtkObject
       +— GtkWidget
          +— GtkContainer
             +— GtkBox
                +— GtkVBox

The widget is to render nine on/off buttons in three rows with three buttons each. The widget will be called Tictactoe, and in fact, it can be seen as a poor man’s solitaire tic tac toe: Whenever three buttons in the same row, in the same column, or on a diagonal are switched on, the widget sends out a ‘‘win’’ signal.

In order to create a subclass of GtkVBox, we must create two structures, as we explained earlier: a Tictactoe structure (the instance structure) and the associated class structure, TictactoeClass (see figure 7.8). The structures are defined in Ruby and DLX, but do understand that, unlike in the previous experiments we have conducted, these two structures do not represent a counterpart inside a compiled C object (i.e. a shared library); they exist purely at run-time, during the execution of the Ruby program.

 1 typealias( "Tictactoe*", "void*" );
 2 class TictactoeClass < struct "TictactoeClass",
 3 [
 4   "GtkVBoxClass", :parent_class,
 5   "void (*tictactoe) (Tictactoe* ttt)", :tictactoe,
 6 ]
 7 end
 8 typealias( "TictactoeClass", "struct TictactoeClass" );
 9
10 class Tictactoe < struct "Tictactoe",
11 [
12   "GtkVBox", :vbox,
13   "GtkToggleButton*[9]", :buttons,
14 ]
15   # ... rest of the definition ...
16 end
17 typealias( "Tictactoe", "struct Tictactoe" );

Figure 7.8: The two structures of the Tictactoe widget: A class structure (top) and an instance structure (bottom).

Clearly visible in the source code fragment is that GtkVBox is extended by nesting the GtkVBox structure (and associated class structure) as the first element of the Tictactoe structure (lines 12 and 4, respectively). As we stated previously, this ensures proper inheritance in the GTK+ Object System.

Also visible in the instance structure is a GtkToggleButton array. This array will hold the nine on/off (toggle) buttons that represent the tic tac toe grid.

After we have created the two structures that represent our subclass, it needs to be registered with the GTK+ Object System. To do this, we create a GTypeInfo object and initialize it with various important values, for example: the size of either structure, a class initialization function, and an instance initialization function12 (see figure 7.9). Once the GTypeInfo object is created and initialized with the proper values, it can be used to register the new widget with the GTK+ Object System with a call to g_type_register_* (line 13 of figure 7.9).

In the source code fragment, two initialization functions are referenced. We do not go into the details of either of these functions. Suffice it to say that the class initialization function is used to add the notion of a ‘‘win’’ signal to the widget class, which can be triggered if a win condition is encountered.

Each time a Tictactoe widget is instantiated, the instance initialization function –the constructor, if you will– is used to create nine toggle buttons and add them to the GtkToggleButton array that we mentioned earlier. The buttons are laid out in a grid pattern. Furthermore, each button’s ‘‘toggle’’ signal is connected via a callback to a Ruby method that determines if a win condition is encountered, and, if so, triggers the widget’s ‘‘win’’ signal.

12In many object-oriented languages this is called the constructor function.

 1 ttt_info = GTypeInfo.new <=
 2 [
 3   TictactoeClass.size,
 4   nil,
 5   nil,
 6   GClassInitFunc.new( self, :tictactoe_class_init ),
 7   nil,
 8   nil,
 9   Tictactoe.size,
10   0,
11   GInstanceInitFunc.new( self, :tictactoe_init ),
12 ];
13 Gtk.g_type_register_static( Gtk.GTK_TYPE_VBOX,
14   "Tictactoe", ttt_info, 0 );

Figure 7.9: Registering the new widget with the GTK+ Object System.

Note: In the API of GTK+, bitfields play a large role; they are not very visible in the above experiment, however. Still, with faulty bitfield support, the experiment would certainly have failed, for two reasons:

1. Testing to see if a toggle button is pressed is done via a bitfield.

2. Many of the structures in GTK+ use bitfields. If either the size or alignment of these bitfields had been improperly reconstructed, then, because of the many nested structures in GTK+, errors would easily have propagated to the point that the experiment would have failed miserably.

Result

The outcome of the experiment is as we expected: When the Ruby program is started, it shows us the window containing our Tictactoe widget. This means that GTK+ has accepted our pure run-time subclass and used it to instantiate a new widget.

Figure 7.10: Screenshots of the working Tictactoe widget.

The displayed Tictactoe widget’s buttons can be toggled on or off, and once three buttons in a single row or column or diagonally are toggled on, the win condition is triggered. As an illustration we have added a few screenshots (see figure 7.10).

7.4 Dynamic Boundaries

In this experiment we show how dynamic boundaries (R4) are handled in DLX. For this experiment, we reuse the Tictactoe example from the previous section. We are going to promote the buttons field from C to Ruby.

In DLX, this is very easy and straightforward. Because the DLX specification is part of an ordinary Ruby class specification, the dynamic boundary becomes visible quite literally at the bottom of the DLX struct (or union) specification, and right before the Ruby class body (lines 5 and 4 in figure 7.11). Promoting (or demoting) a struct field to a Ruby instance variable in DLX involves little more than a cut-and-paste action; cutting the appropriate line from the DLX specification and subsequently pasting it a few lines below (or vice versa for demoting). Just a small syntactical translation is required.

Before:

1 class Tictactoe < struct "Tictactoe",
2 [
3   "GtkVBox", :vbox,
4   "GtkToggleButton*[9]", :buttons,
5 ]
6   # rest of the class ...
7 end

After:

 1 class Tictactoe < struct "Tictactoe",
 2 [
 3   "GtkVBox", :vbox,
 4 ]
 5
 6 def buttons
 7   @buttons ||= []
 8 end
 9
10 # rest of the class ...
11 end

Figure 7.11: Promoting a C structure field to a Ruby attribute.

At this point, the Tictactoe object’s attribute has been promoted from being a C struct field to a Ruby instance variable. Because in Ruby/DLX an instance variable and a C struct field are both accessed via an ordinary Ruby method, nothing of the rest of the file has to be changed to make this promotion work (i.e. with the changes of figure 7.11 applied, we still have a working widget).
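Why the call sites survive such a promotion can be made concrete with a small sketch of ours (the struct-backed reader below is a hypothetical stand-in for what DLX generates, not DLX code): both variants expose buttons as an ordinary method, so every call site stays valid.

```ruby
# Hypothetical stand-in for a DLX-generated struct-field reader.
class StructBackedTictactoe
  def initialize; @struct_fields = { :buttons => [] }; end
  def buttons                     # reads the (simulated) C struct field
    @struct_fields[:buttons]
  end
end

# Plain Ruby attribute, as in the "After" column of figure 7.11.
class PromotedTictactoe
  def buttons
    @buttons ||= []
  end
end

# The call site is identical either way:
[ StructBackedTictactoe.new, PromotedTictactoe.new ].each do |ttt|
  ttt.buttons << :toggle_button
  puts ttt.buttons.size           # => 1 for both
end
```

Because Ruby makes no syntactic distinction between the two kinds of accessor, the rest of the program cannot tell on which side of the boundary the attribute lives.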

Of course, we assume that the promotion or demotion is performed for a reason, otherwise it would be useless to promote or demote an attribute.

A reason to demote an attribute from Ruby to C would be if some portion of a computationally intensive algorithm –which, of course, should be written in C– requires access to it. To the Ruby side, using Ruby/DLX, the demotion should have little effect, since the attribute will still be accessible in the same way it was prior to the demotion.

 1 def self.tictactoe_toggle( widget, ttt )
 2
 3   ttt = Tictactoe.wrap( ttt );
 4
 5   rwins = [ [ 0, 0, 0 ], [ 1, 1, 1 ], [ 2, 2, 2 ],
 6             [ 0, 1, 2 ], [ 0, 1, 2 ], [ 0, 1, 2 ],
 7             [ 0, 1, 2 ], [ 0, 1, 2 ] ];
 8   cwins = [ [ 0, 1, 2 ], [ 0, 1, 2 ], [ 0, 1, 2 ],
 9             [ 0, 0, 0 ], [ 1, 1, 1 ], [ 2, 2, 2 ],
10             [ 0, 1, 2 ], [ 2, 1, 0 ] ];
11
12   rwins.size.times { |k|
13     success = true;
14     found = false;
15
16     rwins[0].size.times { |i|
17       success = success && (ttt.buttons[rwins[k][i]*3+cwins[k][i]].active == 1);
18       found = found || ttt.buttons[rwins[k][i]*3+cwins[k][i]] == widget;
19     }
20
21     if( success && found )
22       Gtk.g_signal_emit( ttt, @@tictactoe_signals[TICTACTOE_SIGNAL], 0 );
23       break;
24     end
25   }
26 end

Figure 7.12: Calculation of ‘‘win’’ conditions C-style.

A reason to promote an attribute from C to Ruby would be if, after some development time, code becomes more or less stable and there is no real need for an attribute to be accessible from C. It is likely that this opens the opportunity to refactor portions of the code to make it more readable or compact, as Ruby (in contrast to C) allows for this.

Our reasons for promoting the buttons attribute from C to Ruby fit exactly these criteria. There was no real reason for the buttons attribute to be accessible from C (indeed, the whole point of the original Tictactoe example was to show that we could create an all-Ruby subclassed widget), and we wanted to simplify the code that is associated with the buttons attribute.

To illustrate this, we rewrote the original win condition code, shown in figure 7.12 (which is based on the original C code, as we stated in section 7.3.2).

Apart from the fact that the C-like code looks bulky, one needs to pay really close attention to understand what is actually going on. At the moment of writing it may be clear to the original programmer, but the code is probably pretty hard to maintain, by either the same programmer or someone else, if, at some point, we change the rules for our Tictactoe game.

We replaced the code from figure 7.12 with the following Ruby-nized implementation, which has exactly the same functionality (namely, triggering the appropriate signal if a win condition is encountered).

 1 def self.tictactoe_toggle( widget, ttt )
 2
 3   ttt = Tictactoe.wrap( ttt );
 4
 5   if( ttt.buttons.collect{ |button| button.active }.join.match(
 6       /^111......|...111...|......111|1..1..1..|.1..1..1.|..1..1..1|..1.1.1..|1...1...1$/ ) )
 7       # horizontal           vertical             diagonal
 8     Gtk.g_signal_emit( ttt, @@tictactoe_signals[TICTACTOE_SIGNAL], 0 );
 9   end
10 end

Figure 7.13: Calculation of ‘‘win’’ conditions Ruby-style.

This code is exactly what one wants it to be: compact, concise and clear.
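The encoding behind figure 7.13 can be exercised standalone: the nine button states are flattened, row by row, into a nine-character string of 1s and 0s, and each nine-character branch of the alternation corresponds to one win line. A sketch (the pattern is the one from figure 7.13; the win? helper is our own):

```ruby
# Each branch matches one win line in the flattened 3x3 board.
WIN = /^111......|...111...|......111|1..1..1..|.1..1..1.|..1..1..1|..1.1.1..|1...1...1$/

def win?( states )                  # states: nine 0/1 flags, row by row
  !( states.join =~ WIN ).nil?
end

puts win?( [1,1,1, 0,0,0, 0,0,0] )  # top row       => true
puts win?( [1,0,0, 1,0,0, 1,0,0] )  # left column   => true
puts win?( [0,0,1, 0,1,0, 1,0,0] )  # anti-diagonal => true
puts win?( [1,1,0, 0,1,0, 0,0,0] )  # no line       => false
```

Note that, because every branch is exactly nine characters long and is matched against a nine-character string, the anchors on the first and last branch suffice.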

7.5 Performance Experiments

7.5.1 Execution Speed Performance

At the current stage of development, execution speed performance is not a priority. However, to assess the current status, we have conducted execution speed performance tests.

There are many things whose execution speed we could test. We have tested the execution performance of the most prominent of these: function calls.

Setup

To test the execution performance of the function calls, we have created a source code generator that generates test functions. The test functions are called in a structured manner to reliably assess the performance.

Ideally, one would like to assess the performance in general, for any function prototype, but this is unfeasible: There are simply too many combinations possible. We therefore selected two sets of functions. Each set contains functions that follow a template function, with fixed parameter and return types. The contained functions increase in arity.

We selected the following two sets:

1. The ‘‘void’’ set, with prototypes:

• void void_func0();

• void void_func1( int i1 );

• ...

• void void_funcn( int i1, ..., int in );

2. The ‘‘dummy’’ set, with prototypes:

• PerformanceDummy* PerformanceDummy_func0()

• PerformanceDummy* PerformanceDummy_func1( PerformanceDummy* p1 )

• ...

• PerformanceDummy* PerformanceDummy_funcn( PerformanceDummy* p1, ..., PerformanceDummy* pn ).

The PerformanceDummy is a dummy structure that is used to represent a non-generic pointer. In DLX, the corresponding class must be searched for in a lookup table (part of the DLX Type Database), which incurs a small penalty. We are interested in the performance impact of this.

To have something to compare the results with, our test code generator generates source code for DLX as well as a few related solutions:

1. Pure C-to-C function invocation (c).

2. Pure Ruby-to-Ruby method invocation (ruby).

3. Ruby-to-C function invocation via Ruby’s extension interface (ext).

4. Ruby-to-C function invocation via Ruby/DLX (dlx).

5. Ruby-to-C function invocation via the obsolete Ruby/DL13 (dl).

Included in this list are pure C-to-C and pure Ruby-to-Ruby invocation, to provide insight into the performance of single-language function invocation (as opposed to multi-language function invocation that ‘‘crosses the boundary’’).

We conducted the tests by doing 2,500 consecutive function invocations. Then we repeated these tests for the functions with increasing arity.
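A timing harness along these lines can be sketched with Ruby's standard Benchmark module. This is our own illustrative stand-in, not the generator used for the tests; the function bodies are empty, as in the ‘‘void’’ set:

```ruby
require "benchmark"

N = 2_500  # consecutive invocations per measurement, as in our tests

# Stand-ins for two generated "void" test functions of different arity.
def void_func0; end
def void_func3( i1, i2, i3 ); end

# Wall-clock time, in milliseconds, for N consecutive invocations.
def measure
  Benchmark.realtime { N.times { yield } } * 1000.0
end

puts "arity 0: %.3f ms" % measure { void_func0 }
puts "arity 3: %.3f ms" % measure { void_func3( 1, 2, 3 ) }
```

In the real setup, the same generated call sequence is emitted once per contestant (c, ruby, ext, dlx, dl), so that each measures an identical workload.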

Early results showed that a pure Ruby-to-Ruby method invocation is about 250-300 times slower than a pure C-to-C function invocation. Therefore we had to scale the pure C tests until the results were approximately in the same range as the other results. We achieved this by increasing the workload for the pure C-to-C invocation by a factor of 400 (see14), thus making the C-to-C invocation workload a total of 1,000,000 consecutive invocations.

13See section 5.5.3.

We did not test member field access explicitly, but since these are internally handled similarly to functions in DLX, it is expected that any function invocation with arity 0 gives a good impression of the current performance status for this kind of instruction. Member field access in pure C is considered instantaneous, especially when compared to Ruby and Ruby/DLX.

Hardware and Software

All tests were conducted on a notebook with an Intel Pentium IV 2.0 GHz processor and 768 MB RAM. The notebook was running a GNU/Linux 2.6.19.2 SMP kernel. All files were compiled with GCC 4.1.1 20060724 (prerelease) with level 2 optimization (-O2). We used Ruby 1.8.5 (2006-08-25) to conduct the tests.

Results

The results of the tests are shown in figure 7.14. The data points have been connected via lines to visually improve the graphs (obviously, since a function with an arity of 2.4 bears little meaning). Results for DLX are given in bold, using a solid line.

Clearly visible in the results for both sets, and for all contestants, is that an invocation with zero arity is considerably faster. This is likely caused by the overhead of creating an argument stack, which is not required in that case.

Furthermore, we see that Ruby/DLX is quite a bit slower than a hard-coded extension written in the Ruby extension API: about a factor of two in the case of the ‘‘void’’ set, up to a factor of about four in the case of the ‘‘dummy’’ set and for increased arity.

Also clearly visible from both graphs is that, on the one hand, integer parameters in the ‘‘void’’ test set are converted from and to Ruby in more or less constant time, while, on the other hand, the automatic (un)wrapping of the PerformanceDummy parameter and return type objects, which is required for the ‘‘dummy’’ test set, takes linear time. The ext version doesn’t suffer as much from this, because its wrapper classes are hard-coded, while in DLX they have to be looked up in the DLX Type Database.

On average, pure Ruby function invocation is about 30-50% slower than invocation of a Ruby extension function. Furthermore, as expected, the –in our opinion– obsolete Ruby/DL interface is by far the slowest. Ruby/DL also does not really support the ‘‘dummy’’ test set; it can only return a generic pointer (see 5.5.3): there is no support to handle the PerformanceDummy* return type of the functions in the ‘‘dummy’’ test set.

In section 6.7.1 we already stated that we are willing to trade in a little performance in favor of the ease of construction and maintainability that our interface offers. Furthermore, we already assume that when performance is of the utmost importance, the implementation should be done mostly in C. Still, the ultimate goal would be to offer a performance that is comparable to the Ruby extension API (also recall goal 6 from section 4.7.2). Because of this, it is important to investigate a little as to what is causing the performance gap between DLX and the Ruby extension interface.

14This value was chosen to avoid interference with the other plots in the graph.

[Graphs omitted: elapsed wall-clock time (ms) against function arity (0-10), with plots for dl, dlx, ext_proxy, ruby, ext, and c. Top: function set ‘‘void’’. Bottom: function set ‘‘dummy’’.]

Figure 7.14: Wall clock execution times for function calls of increasing arity. Tests conducted for pure C, pure Ruby, Ruby extension API, Ruby extension API with pure Ruby proxy function, Ruby/DLX, and Ruby/DL. Times are for 2,500 consecutive invocations, except for pure C, which was scaled by a factor 400 (1,000,000 invocations). Top: Results for function set ‘‘void’’. Bottom: Results for function set ‘‘dummy’’.

Proxy Functions

One of the reasons DLX performs considerably worse than a hard-coded binding, written directly in Ruby’s extension API, may be because of an extra pure Ruby proxy function that is used by DLX. This pure Ruby proxy function is created dynamically by the Symbol Binder (see section 6.3.3). It sits between every invocation and the C proxy function (from the function pool) that we discussed in the previous chapter.

This Ruby proxy function is necessary, because of a current limitation in the Ruby extension interface:

In Ruby, a method can be declared with an optional default value for each parameter. Should such a parameter not be explicitly passed to the function, the default value is taken.
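For reference, this is what such a default parameter value looks like in plain Ruby:

```ruby
# The second parameter falls back to its default when omitted.
def greet( name, greeting = "Hello" )
  "#{greeting}, #{name}!"
end

puts greet( "world" )        # => Hello, world!
puts greet( "world", "Hi" )  # => Hi, world!
```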

Via the Ruby extension interface one can add a C function15 as a Ruby method to a Ruby class.

However, the current version of the Ruby extension interface does not allow specifying default parameters for Ruby functions that are added this way. Investigation of the Ruby source code did not reveal any theoretical constraints for this not to be possible, so we assume it is merely an omission, because nobody ever needed it before.

Unfortunately, for DLX it is important to be able to specify a default value, because we could use this default parameter to select the correct C proxy function from the function pool. Because this is not possible, the current implementation of the Symbol Binder is forced to dynamically create the pure Ruby proxy function: It does nothing more than invoke the Function Caller with the correct parameters, such that the right C proxy function from the function pool is used. Had it been possible to specify default parameters via the Ruby extension interface, we would have been able to bind the Function Caller to each function directly. This would have spared us the penalty of a call to a pure Ruby function, and the penalties caused by Ruby interpretation (which are considerable when compared to C).
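The construction can be sketched in plain Ruby. All names below are ours, hypothetical stand-ins for DLX internals: the generated proxy does nothing but forward to a dispatcher with the pool index baked in, which is exactly the indirection that a default parameter would have made unnecessary.

```ruby
# Hypothetical stand-in for the pool of C proxy functions.
FUNCTION_POOL = [
  lambda { |*args| args.length },  # pretend this is C proxy number 0
]

# Stand-in for the Function Caller: selects a proxy by pool index.
def function_caller( pool_index, *args )
  FUNCTION_POOL[pool_index].call( *args )
end

class Bindings
  # What the Symbol Binder is forced to generate: a pure Ruby proxy
  # whose only job is to supply the pool index. With default-parameter
  # support in the extension API, this extra Ruby-level hop could be
  # bound away.
  define_method( :bound_func ) { |*args| function_caller( 0, *args ) }
end

puts Bindings.new.bound_func( 1, 2, 3 )  # => 3
```

Every call thus pays for one extra interpreted method invocation, which is the overhead the ext_proxy test deliberately reproduces.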

To illustrate the impact of such an improvement, we have added an extra test for both function sets:

7. Ruby-to-C function invocation via the Ruby extension interface, but with a deliberate pure Ruby proxy function in place (ext_proxy).

The negative impact of such a wasteful pure Ruby proxy function is clearly visible in both graphs of figure 7.14. DLX is still slower compared to the slowed-down Ruby extension interface version, but only by 30% (‘‘void’’ set) up to a factor of, roughly, two (‘‘dummy’’ set).

15For this, the function must have a certain prototype, dictated by the extension API.

Hard-coded Wrapping?

Although the Ruby extension version uses hard-coded object wrapping, as opposed to Ruby/DLX, which needs a lookup table (via the DLX Type Database) to find the correct wrapper class dynamically, we are not convinced that this alone causes a performance loss of a factor of two. So the causes for the remaining difference in performance are still a bit unclear.

Discussion

To be honest, we are a little startled by the large difference in performance between pure C and pure Ruby function invocation.

On the one hand, it leads us to believe that, in theory, there should be enough room to do much of the complicated type mapping logic in C, without generally affecting the performance of Ruby invocations, whether they be pure Ruby or Ruby-to-C invocations.

Still, on the other hand, the performance gap may be larger than we anticipated. Luckily, efforts are being made to greatly improve Ruby’s performance in the future [59, 69]. This gives us confidence that our choice for Ruby will hold up in the future. Of course, speeding up Ruby decreases the room for doing the complicated type mapping logic, but we think that things will balance out nicely in the end.

7.5.2 Memory Footprint and Startup Performance

We have not yet been able to reliably assess memory footprint or start up performance, as this is a lot less straightforward than the execution speed performance tests. Things are complicated for two main reasons:

1. Ruby and Ruby/DLX use an automatic garbage collection scheme. This garbage collector is only guaranteed to reclaim unused memory if system memory is exhausted, clouding the actual memory usage of the system at any given point in time.

2. The memory footprint and start up performance are heavily influenced by the API size of a particular shared library (see section 6.7).

The first reason is pretty straightforward and significant in itself. To understand the implications of the second reason, we may need to elaborate a little.

If we want to acquire sensible results, there are two options. We can either

1. compare measured memory footprint or start up performance against a measure of the API size; or we can

2. compare measured memory footprint or start up performance against a ground truth coupling between Ruby and C for a set of shared libraries.

To perform the first test we will need to derive a reliable size measure for any API and we will need a reliable size measure for the portion of the API that a particular test application is using. We do not yet have either of these.

To perform the second test, we will need a ground truth implementation of a Ruby/C interface for a particular library. Candidates for this comparison can be found in existing bindings for Ruby that make use of the Ruby extension API. However, as it seems, many of these Ruby extension bindings are simplified versions, or they are heavily adapted to provide an API that is closer to the Ruby programming style. In effect, this would result in comparing apples to oranges. To reliably perform this experiment, we will need to create a set of bindings using the Ruby extension interface that provide exactly the same interface as their counterparts interfaced using DLX. We have not yet created such a set of bindings.

Even though we have not yet been able to quantify memory footprint and/or start up performance, qualitatively, we think it is safe to say that:

1. The memory footprint of running DLX is currently steep, as this has not been optimized at all (see section 6.7.4); but, on the bright side:

2. The start up performance feels pretty good (typically <2 seconds), which is probably due in part to the optimization introduced by the late binding option of the DLX Symbol Binder (see section 6.3.3.1).

7.6 Memory Management Experiments

Considering the discussion in section 6.8.2.1, there is still a lot to be researched on memory management strategies in DLX. We did, however, perform some preliminary research to assess the feasibility of a possible approach: What if we could use the Ruby garbage collector to keep track of and reclaim unused memory? This is made possible by the fact that we wrap C objects inside Ruby objects. If the Ruby object is garbage collected, then why not garbage collect the C object as well? (More details on this approach can be found in section 6.8.2.)

Since not every structure in C can simply be reclaimed by the generic C function free (due to nested structure pointers, for example), in order to connect the Ruby garbage collector to C objects using DLX, we must first introduce additional DLX specification syntax. This syntax allows us to specify which specific finalization code must be run for a particular type of structure. Therefore, for each struct, DLX automatically adds a define_finalizer method, which allows just this. Consider the source code fragment in figure 7.15, which is taken from our SDL binding.

In line 14, we bind a function that loads an image from disk, which is returned as an SDL_Surface*. In C, once an SDL_Surface* is no longer required, its memory can be reclaimed by invoking the SDL_FreeSurface function (line 15). To connect the garbage collector to objects of class SDL_Surface, a finalizer is defined in line 18.

To actually achieve the finalization, we used Ruby’s dynamic programming features to create new (or extend existing) methods that establish the actual finalization (see figure 7.16).

 1 module SDL
 2   extend DLXImport
 3
 4   dlload( "SDL" );
 5   dlload( "SDL_image" );
 6
 7   class SDL_Surface < struct "SDL_Surface",
 8     [
 9       #...
10     ]
11   end
12   typealias( "SDL_Surface", "struct SDL_Surface" );
13
14   extern( "SDL_Surface* IMG_Load( char* filename )" );
15   extern( "void SDL_FreeSurface( SDL_Surface* surface )",
16           "freeSurface" );
17
18   SDL_Surface.define_finalizer( SDL, :freeSurface );
19 end

Figure 7.15: Defining finalizers for structure objects.

 1 # automatic memory management using finalizers
 2 def self.define_finalizer( finalizer_object, finalizer_call )
 3
 4   $stderr.puts( "Defining Finalizer in #{finalizer_object}
 5                  for type #{self}: #{finalizer_call}" );
 6
 7   self.class_eval {
 8
 9     @@finalizer_object = finalizer_object
10     @@finalizer_call = finalizer_call
11
12     def self.wrap( *args )
13       res = super;
14       ObjectSpace.define_finalizer( res,
15         self.dlx_finalize( res.to_ptr.to_i ) );
16
17       return( res );
18     end
19
20     def self.dlx_finalize( ptr )
21       proc { |id|
22         $stderr.puts( "Finalizer on #{0xffffffff&id}: #{ptr}" )
23         @@finalizer_object.send( "#{@@finalizer_call}", ptr );
24       }
25     end
26   }
27 end

Figure 7.16: Finalization in DLX.

In contrast to the mark+sweep algorithm discussed in section 6.8.2 (which is used under the hood), here we use Ruby’s public interface for defining finalizers: ObjectSpace.define_finalizer in line 15.
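Independent of DLX, the behaviour of this public interface can be demonstrated in a few lines of plain Ruby. Note the classic pitfall it involves: the finalizer proc must be constructed where it cannot see the object, otherwise the closure keeps the object alive forever. The helper name below is illustrative:

```ruby
# Plain-Ruby demonstration of ObjectSpace.define_finalizer. The proc is
# built in a separate method so that its closure does not capture the
# object being finalized -- otherwise the object could never be collected.
def make_finalizer(log)
  proc { |object_id| log << object_id }
end

finalized = []
100.times do
  wrapped = Object.new   # stand-in for a Ruby object wrapping a C object
  ObjectSpace.define_finalizer(wrapped, make_finalizer(finalized))
end
GC.start                 # in MRI, finalizers run once the objects are swept
```

This is exactly the shape of the dlx_finalize proc in figure 7.16, which likewise only closes over the raw pointer value, never over the wrapped object itself.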

As an experiment, we have written a small test program that uses our SDL binding to load a ~200kB bitmap image from disk 10,000 consecutive times. The test machine was equipped with 786MB of RAM, which is not sufficient to hold 10,000 copies of a 200kB image in memory.

1 require "sdl/SDL"
2
3 10000.times do |i|
4   surface = SDL.IMG_Load( "pics/tuxmexico.bmp" )
5   puts( i )
6 end

Figure 7.17: The test program that loads a ~200kB bitmap image from disk 10,000 times.

In the fragment above, it is clear that no reference to the Ruby objects is kept, allowing them to be reclaimed by the garbage collector. Once a Ruby object is reclaimed, the wrapped C object is no longer accessible from within Ruby, which would introduce a memory leak if the C object were not reclaimed as well.

To assess the difference in behaviour, we have run the experiment twice. First, we ran the experiment without the actual call to the SDL_FreeSurface function (line 23 in fig. 7.16). This caused both our test application and test machine to come to a grinding halt (we had to pull the plug): Due to insufficient memory, the machine started swapping and became unresponsive.

In the second test we did call the finalizer and the results were as expected:

Defining Finalizer in SDL for type SDL::SDL_Surface: freeSurface
.
.
223
224
225
Finalizer on 3688985988: 135001376
Finalizer on 3688922048: 135370136
Finalizer on 3688921148: 134911680
.
.

Figure 7.18: Excerpt from the captured terminal output from the test program from figure 7.17.

Here, we see the confirmation that the finalizer is defined for type SDL_Surface. Then, we alternatingly observe:

1. A number of images loaded from disk (~100, denoted by the integer that acts as a serial number).

2. A number of finalizers that are called to destroy the SDL_Surface objects in C, once the wrapping Ruby objects have been reclaimed by the Ruby garbage collector.

7.6.1 Discussion

Please note that the experiment above is by no means complete. It assumes that all wrapped instances of a structure must be reclaimed once all references to the Ruby object are lost. In reality, there may still be references available indirectly through another wrapped C object (e.g. as an element of a C array or as a field of another structure).

This example shows the feasibility of using the Ruby garbage collector to aid in the memory management of the referenced C objects. For it to be useful in production environments, the above approach must be made to cooperate with the under-the-hood implementation of DLX (i.e. which objects are wrapped, how they are wrapped, and the implications for tracking memory). The wrapping Ruby objects must ensure that C objects are really no longer accessible from Ruby before they can actually be reclaimed. For example, the DLX array class should use the mark+sweep function calls for each object that it references (see also section 6.8.2).

Final Note

Due to the dangers of finalizers at this stage of development, the above functionality has been deliberately turned off in the current (and coming) public release of DLX.

Chapter 8

Conclusion and future work

8.1 Conclusion

In this thesis we have attempted to find a solution to a multi-lingual interoperability problem in which a software component written in a lower-level programming language was to be interfaced to a software component written in a higher-level programming language. After a selection process we chose to interface Ruby and C.

The choice for interfacing two languages of such a nature was inspired by three scenarios which we think could benefit from multi-language development under the right circumstances.

From these three scenarios, together with information found in existing literature, we derived the goals and requirements for our solution.

A summary of the most important requirements can be seen as the main hypothesis of this thesis:

Main thesis:

‘‘Is multi-language software development in Ruby and C possible using a run-time language interface under the condition that the interface is easy to establish, maintainable, efficient in use and effective in all required circumstances?’’

To answer this question in a structured manner, however, we must fall back on the original goals and requirements.

Easy Creation and Maintainability

Goal 1: Present a solution that keeps it simple to construct the multi-language interface.

Goal 2: Present a solution that minimizes the risk of: a. synchronization errors, b. heterogeneous mapping, and c. black boxing.


We think that, by choosing a specification (like DllImport) rather than a programmatic solution (like JNI), we have kept it fairly easy to establish an interface between a shared C library and Ruby. The user is aided by our type closures, because they help to detect specification errors and to prevent subsequent run-time errors, especially during the initial specification of the interface. Because of the allowed opt-out, a user can choose to only specify those portions of an API that he really needs.

In contrast to having an ad-hoc programmed solution for every shared library that is interfaced to Ruby, DLX is a separate solution, with its own maintainers. It allows users to establish a multi-language interface between Ruby and an arbitrary shared C library by using a simple specification. This makes us confident that it effectively prevents black boxing.

Our specification is kept very similar to the original C specification, making it more or less trivial for any person to make the translation from C specification to DLX specification. Again, because we use a specification, there is also not much freedom to change the mapping process. We think that this adequately prevents a heterogeneous mapping, even when more than one person is responsible for a particular multi-language interface.

Because the specification is relatively simple, we also feel that it helps prevent synchronization errors: reflecting API changes back to the DLX specification is not a burden, and the specification allows the user to keep a clear overview. However, this is not enough to prevent all forms of synchronization errors. Two forms persist which are not prevented by choosing a specification alone.

A shared library version mismatch is the most likely error to occur. This means that, at run-time, a shared library is used with a different version than the version to which the specification belongs. DLX cannot prevent this in general. However, as we have explained in section 6.6.2.2, and demonstrated in section 7.3, such errors can be prevented by the API. Because this sort of synchronization error is not unique to a solution such as DLX, many free software1 C APIs use these prevention techniques.

To eliminate all forms of synchronization errors, however, we must factor out human influence in the specification process, especially for the trivial parts. Because we have kept the specification so similar to the original C specification, we have researched the possibility of automating the translation from C specification to DLX specification.

This has proven to be harder than expected. While our current release of AutoDLX was successfully used to automatically translate some APIs, it was insufficiently equipped to automatically translate from C API specification to DLX specification in general.

Effectivity

Requirement 2: Present a solution that targets both interoperability between two original software components as well as interoperability between an original and an existing software component.

Requirement 3: Present a solution that makes it possible to establish the interface from just one side.

Requirement 6: Provide a solution that always allows an effective interfacing between two software components A and B, regardless of any simplifications added for the sake of maintainability or seamlessness.

1 GPL’d, open source, etc.

Because of requirements 2 and 3, we are required to interface with any existing shared C library and its API, regardless of the way the API is intended to be used. This is especially important when choosing a specification rather than a programmatic interface construction approach; a specification not only simplifies the interface construction, it also limits the freedom for ‘‘creative’’ possibilities when having to interface with an awkward C API.

Previous experience has taught us that quite a few language interface solutions fail sooner or later on real world examples because of these requirements; in C, some pretty unorthodox APIs are possible that nevertheless make perfect sense after all.

We have gone to great lengths in covering as much of the original C specification as possible and presented sensible mappings for each type (section 6.4).

Unfortunately, we could not find an existing series of tests to demonstrate the effectiveness of multi-language interoperability between C and one or more other languages. Therefore, in section 7.3, we submitted DLX to two experiments which we knew from previous experience to provide a fairly complete test. The results of these experiments were positive, and we were able to establish an effective interface from just Ruby, without any additional C source code being required.

While experiments such as the above contribute to our confidence that our solution is effective in most circumstances, we cannot prove this in general. Only as more and more existing shared C APIs are interfaced to Ruby using DLX will our confidence increase.

Efficiency in Use

Because the term ‘‘efficiency in use’’ is rather abstract, we made it more concrete by selecting three subtopics, each of which we think can increase the efficiency in use. These are:

1. Seamlessness

2. Dynamic boundaries

3. Portability

Seamlessness

Requirement 4: Present a solution that provides at least the same level of encapsulation to the higher-level programming language as the lower-level programming language does.

Goal 3: Present a solution that hides unnecessary implementation details of the lower-level programming language from the higher-level programming language, as much as possible.

While we are confident that our solution is effective in establishing an interface with even the most challenging API semantics, seamlessness through encapsulation and information hiding is still a work in progress. We are constantly thinking of new and better ways to improve seamlessness (where applicable, some of these latest ideas have been mentioned in the various discussion sections of section 6.4).

We do achieve the same level of encapsulation as C by, for instance, mapping structure and union member field access to Ruby method calls. We are greatly helped here by the fact that, in Ruby, method invocation and public attribute access look exactly the same from a usage perspective.

We also hide all notions of pointers and, where applicable, we automatically switch between the interpretation of pointers and, for instance, nested structures when accessing such member fields (see section 6.4.5). This way, we avoid the dreaded mixing of the -> and . operators. In Ruby, there is only one such operator (the . operator) and it is used in all similar circumstances. So we automatically map both C operators to the one Ruby operator, which allows us to always use the . operator2.
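The idea can be mimicked in plain Ruby: whether the wrapped C member is stored inline (accessed with . in C) or behind a pointer (accessed with ->), the wrapper exposes it through one and the same method call. The class below is purely illustrative and not part of DLX:

```ruby
# Illustrative sketch: one Ruby accessor hiding two C representations.
# In C, an inline member is read with "s.value" and a pointed-to member
# with "p->value"; the Ruby caller never sees the difference.
class MemberAccessSketch
  def initialize(storage, boxed: false)
    @storage = storage
    @boxed = boxed        # true mimics a pointer indirection
  end

  # Ruby callers always write wrapper.value, regardless of representation.
  def value
    @boxed ? @storage[0] : @storage
  end
end

inline  = MemberAccessSketch.new(42)                 # C: s.value
pointed = MemberAccessSketch.new([42], boxed: true)  # C: p->value
```

In DLX, the decision which representation applies is taken from the type information in the specification rather than from a flag, but the effect at the call site is the same: a single . suffices.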

This, however, brings us to the limitations of achieving seamlessness through information hiding. Some things simply cannot be hidden, or they become somewhat dangerous if they are.

For instance, in C, there is a difference between:

• struct Triangle ary[10]; and

• struct Triangle* ary[10].

This sort of difference cannot automatically be resolved by DLX. Therefore, it cannot be kept hidden from the user.

In DLX there is no notion of pointers, but because we see no other parts of our type mapping having to deal with a similar problem, we simply introduced new terminology to make end users aware of these differences. We coined the former array a contiguous array and the latter a non-contiguous array. It is a compromise between fully exposing the notion of pointers and making end users aware of the fact that sometimes life just isn’t that beautiful.

Other things can become very dangerous. For instance, assigning a wrapped C structure object to a nested structure field results in a duplication of information (see section 6.4.5). This is explicit in C, but currently implicit in DLX. It may lead to strange situations if mutations are made to either object later. While we suggest a possible way of dealing with it, we are unsure if and how such solutions may undermine the effectivity of DLX.

Dynamic Boundaries

Goal 4: Provide a solution that makes it as easy as possible to adapt the boundary between the software components, making it easy to change what goes to which side of the fence.

In section 7.4, we showed an experiment in which we tested how easy it was to adapt the boundary using DLX. In our experience, this was a fairly easy thing to achieve. DLX made it very clear, where the current boundary was, and shifting attributes from Ruby to C and vice versa was simply a matter of moving the specification up or down a few lines in the source code, while doing a little translation in the process.

2 This is demonstrated in section 7.3.1, figure 7.5.

When we moved the attributes from C to Ruby, it was remarkable that we did not have to change any other line of code to get the demonstration application working again3.

Of course, our experiment was a rather simple test case, where only a single independent attribute was switched sides. In real world examples it is expected that, because of dependencies, a little more effort is required, but this is unavoidable.

Portability

Requirement 5: A solution must be presented that allows for the separation of concerns regarding portability related issues.

We feel that our solution works well when it comes to portability. First of all, the implementation itself seems fairly portable, despite its run-time memory layout reconstruction. We have confirmed it to be working on a variety of platforms in table 2, section 7.2.

Secondly, we have hidden many of the platform dependent details from the user, such as differences in library naming conventions, or differences in the byte sizes of various types (int, long, ...).
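Such size differences are directly observable even from plain Ruby, for example via Array#pack’s native-size directives; these are precisely the details that DLX keeps away from the user:

```ruby
# Native C type sizes differ per platform and compiler; Array#pack with
# the "!" (native size) modifier exposes them from plain Ruby.
native_int_size  = [0].pack("i!").bytesize  # sizeof(int)  -- usually 4
native_long_size = [0].pack("l!").bytesize  # sizeof(long) -- typically 4 or 8
```

On a 32-bit platform both values are commonly 4, while on most 64-bit Unix platforms long grows to 8 bytes; a DLX specification mentioning int or long stays the same either way.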

To further test the portability, in section 7.2, we submitted DLX to a portability field test, where we wrote a small application in Ruby, and ran it on two very distinct computer systems. Apart from the obvious difference in screen size, the application ran exactly the same on both platforms, without having to alter any of the Ruby code. This field test increased our confidence that we succeeded in writing a portable solution.

Additional Goals and Requirements

In chapter 4 we also listed a number of goals and requirements that we derived from existing literature. These embody known issues with multi-language development in general. While they are important topics for a complete solution, which we obviously strive for with DLX, they have not been the focus point of this work.

Documentation

Goal 5: Strive for a solution that requires less (detailed) documentation.

By choosing a specification that is so similar to the original specification, we hoped that this would lead to a solution requiring less documentation. While we feel that our solution does require less documentation than it would have, were it a programmatic solution such as JNI, it still requires a lot of documentation, as can be witnessed from chapter 6 (even if the information presented there is not specifically end user documentation).

3 We can possibly attribute this to the seamlessness of the interface with DLX.

Performance

Goal 6: Provide a solution with a computational, start up and perceived performance that strives to be comparable to the performance of the higher-level programming language as much as possible.

Performance has not been a focus point in this thesis. Still, we have performed some performance experiments to estimate the current status. From these results, it is clear that a pure C-to-C function call is many times faster than a pure Ruby-to-Ruby function call. This gives us room to execute many of the instructions needed for interface mapping in C, without affecting the performance of Ruby too much.

Our current implementation is still slower than our ground truth comparison: a hard-coded multi-language interface, written directly against Ruby’s own C extension interface. Part of the difference in performance could be explained, while a substantial part must still be addressed by future performance improvements.

Still, if computational performance is important, then it is best to isolate the critical algorithms and loops, so that they can be performed in pure C.

We have quantified neither start up performance nor memory footprint. We have given reasons for this and explained why it is harder to quantify than execution speed. Part of the problem is caused by the fact that Ruby/DLX holds all API related type information in memory, even when just a small portion of the API is actually used.

For this we suggested two solutions, late binding and profiling, of which the former was actually implemented. While not formally quantified, we experienced a significant improvement on the perceived start up performance because of this.

Overall perceived startup performance was qualified as reasonable to fast (although, as mentioned, depending on the size of the API). The perceived memory footprint remains rather poor.

Memory Management

Requirement 7: Provide starting points for addressing memory management using the found solution.

One important issue that is encountered when doing multi-language software development is memory management. Often the two interfaced components have different memory management strategies.

Although not a focus point either, in this thesis we argued that we see possibilities for connecting Ruby’s automatic garbage collection scheme to C objects that cross the boundary (i.e. we do not consider any objects that remain internal to a shared C library). We feel that this is made possible because in Ruby, at the implementation level, all objects –and not just DLX wrapped objects– are merely wrappers around the original C objects.

To briefly test the feasibility of such an approach, we submitted DLX to a preliminary experiment in which we automatically garbage collected C objects when they were no longer accessible from Ruby.

The results were promising (section 7.6), but it is too early to conclude anything from such preliminary research. There are many things that may prove to be too hard to solve adequately by this approach in the future. Therefore, this preliminary research can only act as a starting point for future research, which is everything we wanted from requirement 7 in the first place.

Threading

Requirement 8: Provide starting points for addressing threading using the found solution.

Due to time and size constraints, threading was not thoroughly experimented with in this work. Some of the experiments that we have performed use native threading, while others use Ruby threading. Although threading appeared to be working in those experiments, in this thesis we only provided suggestions with respect to threading, as we could not provide quantified results. The suggestions we made mainly stated that it is best to prefer Ruby’s threading over native threading, and that, without further research, it is best to avoid mixing Ruby threads and native threads.

8.2 Future Work

DLX is still a work in progress and there is still much work to do. In this section we shall highlight some of the most prominent things that still need to be done. Throughout chapter 6 we also give small hints for future improvements on, sometimes, very specific subjects.

Maintainability Improvements

A multi-language interface specified in DLX is already reasonably maintainable. However, as API specifications grow, or as larger portions of a C API specification are translated to a DLX specification, it becomes harder to detect slight API changes.

One way of eliminating these errors, as well as instantly boosting maintainability to a whole new level, would be to automatically translate the C API specification to a DLX specification. As we said earlier in this chapter, we already started to work on this with AutoDLX, but the current implementation is not applicable in general.

In the future we would like to redesign AutoDLX to better address the difficulties that we encountered while using it. Ultimately, because both specifications are so similar, we feel that it must be possible to automatically translate a C API specification to a DLX specification in general, with minimal input from an end user.

Seamlessness Improvements

Higher-level programming languages, such as Ruby, and lower-level programming languages, such as C, are fairly distinct in feature set. This is one of the reasons that we selected them in the first place.

However, this also means that absolute seamlessness is very hard to achieve. We showed in this thesis that sometimes it is just not possible to map one type in C seamlessly to another type in Ruby.

However, there are things that can be improved greatly; the degree of encapsulation, for instance.

Currently our solution provides an encapsulation at the higher-level language with at least the same level of encapsulation as the lower-level language. This means, for instance, that structure member fields can be accessed recursively, and that all member fields are wrapped in the corresponding types (see figure 5.7 in section 5.4.2.1).

However, in the future, we must strive to reach a level of encapsulation that provides an encapsulation that is more natural to the higher-level language. Two concrete examples come to mind:

1. Improving encapsulation of C arrays by providing extra type specification syntax (as we explained in section 6.4.7).

2. Encapsulating C functions inside corresponding Ruby classes, so that they may become instance methods instead of class methods.

This last example must be explained a little further. Recall that, in Ruby, attributes and behaviour are both encapsulated by a class. In C, attributes can be encapsulated by, for example, a structure type, while individual instructions can be encapsulated by functions. Many times, however, a function is specified that operates on an object of a structure or union type.

Collections of such functions that operate on objects of a similar type are often distinguished from other functions by either

• using a special naming prefix; or

• by the fact that the first argument of such a function is always of the (structure) type that it operates on.

It would be very useful, if we could devise a way to automatically associate such a collection of functions with the particular structure type. This way we may be able to provide a better encapsulation on the Ruby end of the interface.
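A possible mechanism for such an association can be sketched in a few lines of Ruby metaprogramming. The module and helper below are hypothetical; DLX does not currently provide them:

```ruby
# Hypothetical sketch: promote module-level functions whose first argument
# is the wrapped object to instance methods on the wrapper class, so that
# SDL.freeSurface(surface) could also be written as surface.free.
module InstanceMethodPromotion
  # Install klass#method_name(*rest) delegating to mod.fn_name(self, *rest).
  def self.promote(klass, mod, fn_name, method_name = fn_name)
    klass.send(:define_method, method_name) do |*rest|
      mod.send(fn_name, self, *rest)
    end
  end
end
```

For a binding such as the SDL example of figure 7.15, one could then write something like InstanceMethodPromotion.promote(SDL_Surface, SDL, :freeSurface, :free) and call surface.free directly; the remaining (harder) problem is detecting which functions qualify, based on the naming prefix or first-argument type mentioned above.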

Portability Improvements

Because of the lack of access to different hardware and operating systems, we have not been able to verify correctness of DLX on all platforms that we would like to support.

In the future we would like to see the number of platforms on which DLX is successfully tested (and used) increase.

One particularly interesting platform would be any computer system that uses a big-endian architecture, because our memory layout reconstruction algorithm is likely to require some updates if it is to work correctly on such an architecture.

Perhaps this can be combined with the wish to support MacOS X, since we have received quite a few e-mails from people who want to use DLX on MacOS X.

Performance Improvements

Since performance has not been a focus point in the current work, it is needless to say that there is still a lot of room for improvement in this regard.

One way of improving performance was already suggested in this thesis, in section 6.7.4: profiling. By adding profiling support, we can minimize the penalty of having to load (and keep in memory) a large API specification of which only a small portion is actually used by any individual application.

Memory Management and Threading

Both memory management and threading have not been a focus point, but for memory management we did provide a nice starting point for future work.

We also still have to do some extensive research on using Ruby threading in combination with a shared library that provides native threading of its own.

Quantified Productivity Gain

Finally, we conclude our growing list of future work by looking back at scenario 1 in chapter 2. Recall that one of the main reasons for interfacing a higher-level to a lower-level language was an expected productivity gain, obtained by developing non-computationally-intensive (support) code in a more productive and more concise (i.e. higher-level) programming language.

Because providing reliable quantifiable data that either confirms or rejects this assumption was well beyond the grasp of this Master’s Thesis, we assumed it to be true and focused on the secondary prerequisites for providing our solution instead.

Ultimately, however, it would be very nice if, somehow, we could reliably quantify the software productivity of a software project using single-language software development (i.e. without our solution) versus that of multi-language software development with our solution, to see whether, and if so how large, the actual productivity gain turns out to be.

Bibliography

[1] URL http://www.ruby-lang.org.

[2] The dotgnu website, 2007. URL http://www.gnu.org/software/dotgnu/.

[3] The .net developers website, 2007. URL http://msdn2.microsoft.com/en-us/netframework/default.aspx.

[4] Mozilla firefox add-ons website, July 2007. URL https://addons.mozilla.org/en-US/firefox/.

[5] The gtk+ object system, July 2007. URL http://developer.gnome.org/arch/gtk/object.html.

[6] The mono project’s website, 2007. URL http://www.mono-project.com/Main_Page.

[7] Dynamic language support on the jvm, 2007. URL http://www.artima.com/lejava/articles/ dynamic_languages.html.

[8] The dlr: Dynamic language runtime, 2007. URL http://www.ironpython.info/index.php/The_DLR: _Dynamic_Language_Runtime.

[9] Wikipedia on Common Language Infrastructure. URL http://en.wikipedia.org/w/index.php?title=Common_Language_Infrastructure&oldid=143866380.

[10] Wikipedia on Representational State Transfer. URL http://en.wikipedia.org/w/index.php?title=Representational_State_Transfer&oldid=150480366.

[11] Wikipedia on rigid body. URL http://en.wikipedia.org/w/index.php?title=Rigid_body&oldid=143939292.

[12] Wikipedia on information hiding (computer science). URL http://en.wikipedia.org/w/index.php?title=Information_hiding&oldid=141028903.

[13] Wikipedia on memory management. URL http://en.wikipedia.org/w/index.php?title=Memory_management&oldid=140503365.

[14] Wikipedia on (software) porting. URL http://en.wikipedia.org/w/index.php?title=Porting&oldid=138164846.

[15] Wikipedia on prosumer as producer and consumer. URL http://en.wikipedia.org/w/index.php?title=Prosumer&oldid=146954064.


[16] Wikipedia on separation of concerns (computer science). URL http://en.wikipedia.org/w/index.php?title=Separation_of_concerns&oldid=140914997.

[17] Wikipedia on threading (computer science). URL http://en.wikipedia.org/w/index.php?title=Thread_%28computer_science%29&oldid=138197462.

[18] A. Albrecht and J. Gaffney. Software function, source lines of code and development effort prediction: A software science validation. IEEE Transactions on Software Engineering, 9(11):639–648, 1983.

[19] Allan J. Albrecht. Measuring application development productivity. Proc. IBM Application Development Symposium, 1979.

[20] Daniel J. Barrett, Alan Kaplan, and Jack C. Wileden. Automated support for seamless interoperability in polylingual software systems. In SIGSOFT ’96: Proceedings of the 4th ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 147–155, New York, NY, USA, 1996. ACM Press. ISBN 0-89791-797-9. doi: http://doi.acm.org/10.1145/239098.239123.

[21] Victor R. Basili and Marvin V. Zelkowitz. Analyzing medium-scale software development. In ICSE, pages 116–123, 1978.

[22] Fabrice Bellard. Tiny C Compiler website, 2007. URL http://fabrice.bellard.free.fr/tcc/.

[23] J. A. Bergstra and P. Klint. The discrete time ToolBus – a software coordination architecture. Sci. Comput. Program., 31(2-3):205–229, 1998. ISSN 0167-6423. doi: http://dx.doi.org/10.1016/S0167-6423(97)00021-X.

[24] Jan Bergstra. Philosophy of open computing: Phoocom, 2007. URL http://www.phil.uu.nl/~janb/phoocom/top.html.

[25] Antoine Beugnard. Method overloading and overriding cause encapsulation flaw: an experiment on assembly of heterogeneous components. In SAC ’06: Proceedings of the 2006 ACM symposium on Applied computing, pages 1424–1428, New York, NY, USA, 2006. ACM Press. ISBN 1-59593-108-2. doi: http://doi.acm.org/10.1145/1141277.1141608.

[26] B. W. Boehm and W. L. Scherlis. Megaprogramming. In Proceedings of the DARPA Software Technology Conference, pages 63–82, 1992.

[27] Barry Boehm and Victor R. Basili. Gaining intellectual control of software development. Computer, 33(5):27–33, 2000. URL citeseer.ist.psu.edu/boehm00gaining.html.

[28] Hans-J. Boehm. Bounding space usage of conservative garbage collectors. In POPL ’02: Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 93–100, New York, NY, USA, 2002. ACM Press. ISBN 1-58113-450-9. doi: http://doi.acm.org/10.1145/503272.503282.

[29] Hans Boehm, Alan Demers, and Mark Weiser. A garbage collector for C and C++ – website. URL http://www.hpl.hp.com/personal/Hans_Boehm/gc/.

[30] Luca Cardelli. Type Systems, chapter 103. CRC Press, Boca Raton, FL, 1997. URL citeseer.ist.psu.edu/cardelli97type.html.

[31] Dave A. Chappell. Enterprise Service Bus. O’Reilly.

[32] Wesley J. Chun. Core Python Programming, 2nd ed. Prentice Hall PTR / Pearson Education, 2006. ISBN 0132269937.

[33] The Common Software Measurement International Consortium. History of functional size measurement, March 2007. URL http://www.cosmicon.com/historycs.asp.

[34] W3 Consortium. Simple Object Access Protocol (SOAP) 1.1, May 2000. URL http://www.w3.org/TR/2000/NOTE-SOAP-20000508/.

[35] Mozilla Corp. Mozilla firefox, July 2007. URL http://www.mozilla.com/en-US/firefox/.

[36] Xerox Corporation. Inter-language unification. URL ftp://ftp.parc.xerox.com/pub/ilu/2.0b1/manual-html/manual_1.html.

[37] Dave Thomas, Chad Fowler, and Andy Hunt. Programming Ruby: The Pragmatic Programmers’ Guide, Second Edition. Pragmatic Bookshelf, 2004. ISBN 0974514055.

[38] Ryan Davis and Eric Hodel. Ruby Inline project page, 2007. URL http://rubyforge.org/projects/rubyinline/.

[39] Bruce Dawson. Game scripting in python. In Game Developers Conference Proceedings, 2002. URL http://www.gamasutra.com/features/20020821/dawson_pfv.htm.

[40] Hayco de Jong and Paul Klint. ToolBus: the next generation, 2003. URL http://homepages.cwi.nl/~paulk/publications/fmco02.pdf.

[41] Ruben Geerlings and Marcel Toele. Javaterms, 2002. URL http://www.dwarfhouse.org/mtoele/javaterms.pdf.

[42] GNU. GNU Compiler Collection website, 2007. URL http://gcc.gnu.org/.

[43] International Function Point Users Group. IFPUG website, March 2007. URL http://www.ifpug.org/.

[44] Samuel P. Harbison III and Guy L. Steele Jr. C: A Reference Manual. Prentice-Hall, fourth edition, 2002. ISBN 0133262243. URL http://www.CAReferenceManual.com/; http://www.phptr.com/ptrbooks/ptr_013089592X.html.

[45] Samuel P. Harbison III and Guy L. Steele Jr. C: A Reference Manual. Prentice-Hall, fifth edition, 2002. ISBN 0-13-089592-X. URL http://www.CAReferenceManual.com/; http://www.phptr.com/ptrbooks/ptr_013089592X.html.

[46] Michi Henning. The rise and fall of CORBA. Queue, 4(5):28–34, 2006. ISSN 1542-7730. doi: http://doi.acm.org/10.1145/1142031.1142044.

[47] Graham Hutton. Programming in Haskell. Cambridge University Press, 2007. ISBN 9780521692694.

[48] T. Capers Jones. Measuring programming quality and productivity. IBM Systems Journal, 17(1): 39–63, 1978.

[49] T. Capers Jones. Applied software measurement: assuring productivity and quality. McGraw-Hill, Inc., New York, NY, USA, 1991. ISBN 0-07-032813-7.

[50] Tim Lindholm and Frank Yellin. The Java™ Virtual Machine Specification, 2007.

[51] CAI Electronic Magazine. Focus on Barry Boehm, software estimation pioneer. URL http://www.compaid.com/caiinternet/ezine/barryboehminterview.pdf.

[52] Microsoft. eMbedded Visual C++ 4.0 website, 2006. URL http://www.microsoft.com/downloads/details.aspx?FamilyID=1dacdb3d-50d1-41b2-a107-fa75ae960856&DisplayLang=en.

[53] Harlan D. Mills. Software Productivity. Dorset House Publishing Company, Incorporated; Reprint edition, 1988. ISBN 0-93-263310-2.

[54] Gordon E. Moore. Cramming more components onto integrated circuits. Electronics, pages 114–117, 19 April 1965. URL ftp://download.intel.com/museum/Moores_Law/Articles-Press_Releases/Gordon_Moore_1965_Article.pdf. Director of R&D Labs at Fairchild Semiconductor, http://www.intel.com/intel/annual96/bio_moor.htm.

[55] Andrew M. Phelps and David M. Parks. Fun and games: Multi-language development. Queue, 1(10):46–56, 2004. ISSN 1542-7730. doi: http://doi.acm.org/10.1145/971564.971592.

[56] Frank Piller. Glossary: Mass customization, open innovation, personalization and customer integration. URL http://www.mass-customization.de/glossary.htm.

[57] R. Pressman. Software Engineering, a Practitioner’s Approach, fourth edition. McGraw-Hill, 1997.

[58] Bryce Ragland. Measure, metric, or indicator: What’s the difference? Crosstalk: The Journal of Defense Software Engineering, March 1995. URL http://www.stsc.hill.af.mil/crosstalk/1995/03/Measure.asp.

[59] Koichi Sasada. Wikipedia on YARV: Yet Another Ruby VM, July 2007. URL http://en.wikipedia.org/w/index.php?title=YARV&oldid=146754628.

[60] Mary Shaw. The tyranny of transistors: What counts about software? In Fourth Workshop on Economics-Driven Software Engineering Research, May 2002. URL http://www.cs.cmu.edu/~Compose/ftp/shaw-sw-measures-fin.pdf.

[61] Pieter Spronck, Marc Ponsen, Ida Sprinkhuizen-Kuyper, and Eric Postma. Adaptive game AI with dynamic scripting. Machine Learning, 63(3):217–248, June 2006. ISSN 0885-6125. doi: 10.1007/s10994-006-6205-6. URL http://dx.doi.org/10.1007/s10994-006-6205-6.

[62] Steve Wilson and Jeff Kesselman. Java Platform Performance: Strategies and Tactics. Prentice-Hall, 1st edition, 2000. ISBN 0201709694. URL http://java.sun.com/docs/books/performance/; http://java.sun.com/docs/books/performance/1st_edition/html/JPPerformance.fm.html.

[63] Juha Taina, March 2007. URL http://www.cs.helsinki.fi/u/taina/ohtu/fp.html.

[64] Owen Taylor. Chapter 21. Writing your own widgets, July 2007. URL http://www.gtk.org/tutorial/c2182.html.

[65] Alvin Toffler. The Third Wave. William Morrow and Company, New York, 1980.

[66] Tony Gale, Ian Main, and the GTK team. GTK+ 2.0 tutorial.

[67] Kenneth West. SOA vs. CORBA: Not your father’s architecture. URL http://microsites.cmp.com/documents/s=9063/ilc1086397013472/.

[68] Paul R. Wilson. Uniprocessor garbage collection techniques. In International Workshop on Memory Management, September 1992. URL ftp://ftp.cs.utexas.edu/pub/garbage/gcsurvey.ps.

[69] Yukihiro Matsumoto et al. Ruby 2.0: Rite.