Multi-Language Software Development

Total Page:16

File Type:pdf, Size:1020Kb

Multi-Language Software Development Multi-Language Software Development with Marcel Toele Summer 2007 Master's Thesis in Computer Science FACULTEIT DER NATUURWETENSCHAPPEN,WISKUNDE EN INFORMATICA University of Amsterdam Supervisor: prof. dr. P. Klint Abstract Software projects tend to grow to exist of large quantities of program code. Most of this code will probably be support code for secondary features. It is known that some programming languages are more concise than others (i.e. they require less lines of program code to implement a function point). Most notably, higher-level ‘‘dynamic’’ programming languages tend to be more concise than lower-level ‘‘static’’ programming languages, making them suitable to be used for such support code. However, lower-level programming languages certainly have advantages of their own, most notably in terms of execution speed, and tighter integration with the underlying operating system and hardware. Many attempts exist to make software components, written in different programming languages, interoperate with eachother. We divide these into two groups, distributed and non-distributed approaches. Our research has focussed on the non-distributed approaches. To interface with an existing software component (a shared library in our case), often a solution that uses such a non-distributed approach requires a seperately compiled shared library, which is the result of extensive programming, or, the solution uses a specification. The programmatic solutions that we have come across, because of their programmatic nature, made it hard to establish the interface, and, again because of their programmatic nature, they bare the risk of becoming very hard to maintain. The solutions that use a specification are easier to use and maintain, but are often too simplistic or incomplete to effectively interface with any shared library, regardless of the way such library is intented to be used. In this thesis, we argue that it is beneficial to interface the higher-level ‘‘dynamic’’ programming languages to lower-level ‘‘static’’ programming languages, and we propose a way for interfacing these in such a way, that the interface is easy to establish, maintainable, efficient in use, and effective in all required circumstances. One of the key benefits is that we attempt to interface the languages in run-time, without requiring a separately compiled ‘‘binding’’ library, making it a more natural interfacing approach for an interpreted, higher-level, ‘‘dynamic’’ programming language. i Table of Contents 1 Introduction 1 2 Why Multi-language Development? 5 2.1 Terminology . 5 2.1.1 Multi-language Interoperability . 5 2.1.2 Multilingual versus Polylingual Interoperability . 6 2.1.3 Separation of Concerns . 6 2.1.4 Information Hiding . 6 2.1.5 Encapsulation . 6 2.1.6 Holy Grail . 7 2.1.7 Seamlessness . 7 2.2 Scenario 1: Productivity . 8 2.2.1 Terminology . 8 2.2.2 Software Productivity . 8 2.2.3 Functional Size Measuring . 9 2.2.4 Influences on Software Productivity . 10 2.2.5 Software Productivity and Programming Languages . 10 2.2.6 80/20 Conjecture . 13 2.2.7 The Scenario . 14 2.3 Scenario 2: Multi-Disciplinary Software Development . 14 2.3.1 Higher-level versus Lower-level: Multi-Disciplinary Point of View . 15 2.3.2 Power to the People . 16 iii iv TABLE OF CONTENTS 2.4 Scenario 3: Best of Both Worlds . 18 2.4.1 Language Features . 18 2.4.2 The Scenario . 20 3 Related Work 21 3.1 Distributed Versus Non-Distributed . 21 3.1.1 Distributed Approaches . 21 3.1.2 Non-Distributed Approaches . 22 3.2 A Closer Look . 24 3.2.1 Java . 24 3.2.2 Common Language Infrastructure . 25 3.3 Are Dynamic Languages the Future? . 27 4 Goals and Requirements 29 4.1 Field of Application . 29 4.2 Productivity Revisited . 30 4.3 Interface Creation and Maintainability . 31 4.3.1 Interface Creation . 31 4.3.2 Maintainability . 31 4.4 Efficiency in Use . 32 4.4.1 Increasing Seamlessness . 32 4.4.2 Dynamic Boundaries . 33 4.4.3 Portability . 33 4.5 Multi-Disciplinary Software Development Revisited . 34 4.6 Effectivity . 35 4.7 Known Issues in Multi-language development . 35 4.7.1 Lack of Documentation . 36 4.7.2 Performance . 36 4.7.3 Memory Management . 37 4.7.4 Threading . 39 TABLE OF CONTENTS v 5 Language Selection 41 5.1 Selection Criteria . 41 5.1.1 Productivity versus Execution Speed . 41 5.1.2 Feature Set . 42 5.1.3 Easy Language . 42 5.2 Ruby and C . 42 5.3 Consequences for the Goals and Requirements . 43 5.3.1 Shared Libraries . 44 5.3.2 Run-time . 44 5.4 Considered Alternatives . 45 5.4.1 Java & The Java Native Interface . 45 5.4.2 .Net and The Common Language Infrastructure . 47 5.5 Existing Work . 50 5.5.1 Ruby Inline . 50 5.5.2 Ruby Extension API . 51 5.5.3 Ruby/DL . 51 5.6 Main Thesis . 52 6 DLX 53 6.1 Field of Application . 53 6.2 Design Decisions . 54 6.2.1 Ruby-centred Development . 54 6.2.2 Run-time . 54 6.2.3 Specification not Programming . 54 6.3 The DLX Architecture . 57 6.3.1 Architectural Overview . 57 6.3.2 Library Locator/Loader . 58 6.3.3 Symbol Binder . 59 6.3.4 Function Caller . 61 vi TABLE OF CONTENTS 6.3.5 Type Database . 63 6.3.6 Callback Caller . 64 6.4 Type Mapping: From C to Ruby and Back . 66 6.4.1 C Types . 66 6.4.2 Ruby Types . 67 6.4.3 Integer Types and Floating-point types . 68 6.4.4 Pointer Types . 69 6.4.5 Structure Types . 71 6.4.6 Union Types . 76 6.4.7 Array Types . 77 6.4.8 Typedef Names . 82 6.4.9 Enumerated Types . 83 6.4.10 Function Types . 83 6.4.11 The Void Type . 85 6.4.12 Type Casting . 85 6.5 Memory layout reconstruction . ..
Recommended publications
  • CHERI C/C++ Programming Guide
    UCAM-CL-TR-947 Technical Report ISSN 1476-2986 Number 947 Computer Laboratory CHERI C/C++ Programming Guide Robert N. M. Watson, Alexander Richardson, Brooks Davis, John Baldwin, David Chisnall, Jessica Clarke, Nathaniel Filardo, Simon W. Moore, Edward Napierala, Peter Sewell, Peter G. Neumann June 2020 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 https://www.cl.cam.ac.uk/ c 2020 Robert N. M. Watson, Alexander Richardson, Brooks Davis, John Baldwin, David Chisnall, Jessica Clarke, Nathaniel Filardo, Simon W. Moore, Edward Napierala, Peter Sewell, Peter G. Neumann, SRI International This work was supported by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contracts FA8750-10-C-0237 (“CTSRD”) and HR0011-18-C-0016 (“ECATS”). The views, opinions, and/or findings contained in this report are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. This work was supported in part by the Innovate UK project Digital Security by Design (DSbD) Technology Platform Prototype, 105694. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 789108), ERC Advanced Grant ELVER. We also acknowledge the EPSRC REMS Programme Grant (EP/K008528/1), Arm Limited, HP Enterprise, and Google, Inc. Approved for Public Release, Distribution Unlimited. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: https://www.cl.cam.ac.uk/techreports/ ISSN 1476-2986 3 Abstract This document is a brief introduction to the CHERI C/C++ programming languages.
    [Show full text]
  • Optimizing the Use of High Performance Software Libraries
    Optimizing the Use of High Performance Software Libraries Samuel Z. Guyer and Calvin Lin The University of Texas at Austin, Austin, TX 78712 Abstract. This paper describes how the use of software libraries, which is preva- lent in high performance computing, can benefit from compiler optimizations in much the same way that conventional computer languages do. We explain how the compilation of these informal languages differs from the compilation of more conventional computer languages. In particular, such compilation requires precise pointer analysis, requires domain-specific information about the library’s seman- tics, and requires a configurable compilation scheme. Finally, we show that the combination of dataflow analysis and pattern matching form a powerful tool for performing configurable optimizations. 1 Introduction High performance computing, and scientific computing in particular, relies heavily on software libraries. Libraries are attractive because they provide an easy mechanism for reusing code. Moreover, each library typically encapsulates a particular domain of ex- pertise, such as graphics or linear algebra, and the use of such libraries allows program- mers to think at a higher level of abstraction. In many ways, libraries are simply in- formal domain-specific languages whose only syntactic construct is the procedure call. This procedural interface is significant because it couches these informal languages in a familiar form, which is less imposing than a new computer language that introduces new syntax. Unfortunately, while libraries are not viewed as languages by users, they also are not viewed as languages by compilers. With a few exceptions, compilers treat each invocation of a library routine the same as any other procedure call.
    [Show full text]
  • SSC Reference Manual SDK 2017-1-17, Version 170
    SSC Reference Manual SDK 2017-1-17, Version 170 Aron Dobos and Paul Gilman January 13, 2017 The SSC (SAM Simulation Core) software library is the collection of simu- lation modules used by the National Renewable Energy Laboratory’s System Advisor Model (SAM). The SSC application programming interface (API) al- lows software developers to integrate SAM’s modules into web and desktop applications. The API includes mechanisms to set input variable values, run simulations, and retrieve values of output variables. The SSC modules are available in pre-compiled binary dynamic libraries minimum system library de- pendencies for Windows, OS X, and Linux operating systems. The native API language is ISO-standard C, and language wrappers are available for C#, Java, MATLAB, Python, and PHP. The API and model implementations are thread- safe and reentrant, allowing the software to be used in a parallel computing environment. This document describes the SSC software architecture, modules and data types, provides code samples, introduces SDKtool, and describes the language wrappers. Contents 2 Contents 1 Overview4 1.1 Framework Description.............................. 4 1.2 Modeling an Energy System........................... 4 2 An Example: Running PVWatts5 2.1 Write the program skeleton ........................... 6 2.2 Create the data container ............................ 6 2.3 Assign values to the module input variables.................. 7 2.4 Run the module.................................. 7 2.5 Retrieve results from the data container and display them.......... 8 2.6 Cleaning up.................................... 8 2.7 Compile and run the C program ........................ 9 2.8 Some additional comments............................ 9 3 Data Variables9 3.1 Variable Names.................................. 10 3.2 Variable Types .................................
    [Show full text]
  • Pygtk 2.0 Tutorial
    PyGTK 2.0 Tutorial John Finlay October 7, 2012 PyGTK 2.0 Tutorial by John Finlay Published March 2, 2006 ii Contents 1 Introduction 1 1.1 Exploring PyGTK . .2 2 Getting Started 5 2.1 Hello World in PyGTK . .7 2.2 Theory of Signals and Callbacks . .9 2.3 Events . 10 2.4 Stepping Through Hello World . 11 3 Moving On 15 3.1 More on Signal Handlers . 15 3.2 An Upgraded Hello World . 15 4 Packing Widgets 19 4.1 Theory of Packing Boxes . 19 4.2 Details of Boxes . 20 4.3 Packing Demonstration Program . 22 4.4 Packing Using Tables . 27 4.5 Table Packing Example . 28 5 Widget Overview 31 5.1 Widget Hierarchy . 31 5.2 Widgets Without Windows . 34 6 The Button Widget 35 6.1 Normal Buttons . 35 6.2 Toggle Buttons . 38 6.3 Check Buttons . 40 6.4 Radio Buttons . 42 7 Adjustments 45 7.1 Creating an Adjustment . 45 7.2 Using Adjustments the Easy Way . 45 7.3 Adjustment Internals . 46 8 Range Widgets 49 8.1 Scrollbar Widgets . 49 8.2 Scale Widgets . 49 8.2.1 Creating a Scale Widget . 49 8.2.2 Methods and Signals (well, methods, at least) . 50 8.3 Common Range Methods . 50 8.3.1 Setting the Update Policy . 50 8.3.2 Getting and Setting Adjustments . 51 8.4 Key and Mouse Bindings . 51 8.5 Range Widget Example . 51 9 Miscellaneous Widgets 57 9.1 Labels . 57 9.2 Arrows . 60 9.3 The Tooltips Object .
    [Show full text]
  • Javascript and OOP Once Upon a Time
    11/14/18 JavaScript and OOP Once upon a time . Jerry Cain CS 106AJ November 12, 2018 slides courtesy of Eric Roberts Object-Oriented Programming The most prestigious prize in computer science, the ACM Turing Award, was given in 2002 to two Norwegian computing pioneers, who jointly developed SIMULA in 1967, which was the first language to use the object-oriented paradigm. JavaScript and OOP In addition to his work in computer Kristen Nygaard science, Kristen Nygaard also took Ole Johan Dahl an active role in making sure that computers served the interests of workers, not just their employers. In 1990, Nygaard’s social activism was recognized in the form of the Norbert Wiener Award for Social and Professional Responsibility. Review: Clients and Interfaces David Parnas on Modular Development • Libraries—along with programming abstractions at many • David Parnas is Professor of Computer other levels—can be viewed from two perspectives. Code Science at Limerick University in Ireland, that uses a library is called a client. The code for the library where he directs the Software Quality itself is called the implementation. Research LaBoratory, and has also taught at universities in Germany, Canada, and the • The point at which the client and the implementation meet is United States. called the interface, which serves as both a barrier and a communication channel: • Parnas's most influential contriBution to software engineering is his groundBreaking 1972 paper "On the criteria to be used in decomposing systems into modules," which client implementation laid the foundation for modern structured programming. This paper appears in many anthologies and is available on the web at interface http://portal.acm.org/citation.cfm?id=361623 1 11/14/18 Information Hiding Thinking About Objects Our module structure is based on the decomposition criteria known as information hiding.
    [Show full text]
  • Chapter 4: PHP Fundamentals Objectives: After Reading This Chapter You Should Be Entertained and Learned: 1
    Chapter 4: PHP Fundamentals Objectives: After reading this chapter you should be entertained and learned: 1. What is PHP? 2. The Life-cycle of the PHP. 3. Why PHP? 4. Getting Started with PHP. 5. A Bit About Variables. 6. HTML Forms. 7. Building A Simple Application. 8. Programming Languages Statements Classification. 9. Program Development Life-Cycle. 10. Structured Programming. 11. Object Oriented Programming Language. 4.1 What is PHP? PHP is what is known as a Server-Side Scripting Language - which means that it is interpreted on the web server before the webpage is sent to a web browser to be displayed. This can be seen in the expanded PHP recursive acronym PHP-Hypertext Pre-processor (Created by Rasmus Lerdorf in 1995). This diagram outlines the process of how PHP interacts with a MySQL database. Originally, it is a set of Perl scripts known as the “Personal Home Page” tools. PHP 4.0 is released in June 1998 with some OO capability. The core was rewritten in 1998 for improved performance of complex applications. The core is rewritten in 1998 by Zeev and Andi and dubbed the “Zend Engine”. The engine is introduced in mid 1999 and is released with version 4.0 in May of 2000. Version 5.0 will include version 2.0 of the Zend Engine (New object model is more powerful and intuitive, Objects will no longer be passed by value; they now will be passed by reference, and Increases performance and makes OOP more attractive). Figure 4.1 The Relationship between the PHP, the Web Browser, and the MySQL DBMS 4.2 The Life-cycle of the PHP: A person using their web browser asks for a dynamic PHP webpage, the webserver has been configured to recognise the .php extension and passes the file to the PHP interpreter which in turn passes an SQL query to the MySQL server returning results for PHP to display.
    [Show full text]
  • Data Abstraction and Encapsulation ADT Example
    Data Abstraction and Encapsulation Definition: Data Encapsulation or Information Hiding is the concealing of the implementation details of a data object Data Abstraction from the outside world. Definition: Data Abstraction is the separation between and the specification of a data object and its implementation. Definition: A data type is a collection of objects and a set Encapsulation of operations that act on those objects. Definition: An abstract data type (ADT) is a data type that is organized in such a way that the specification of the objects and the specification of the operations on the objects is separated from the representation of the objects and the implementation of the operations. 1 2 Advantages of Data Abstraction and Data Encapsulation ADT Example ADT NaturalNumber is Simplification of software development objects: An ordered sub-range of the integers starting at zero and ending at the maximum integer (MAXINT) on the computer. Testing and Debugging functions: for all x, y belong to NaturalNumber; TRUE, FALSE belong to Boolean and where +, -, <, ==, and = are the usual integer operations Reusability Zero(): NaturalNumber ::= 0 Modifications to the representation of a data IsZero(x): Boolean ::= if (x == 0) IsZero = TRUE else IsZero = FALSE type Add(x, y): NaturalNumber ::= if (x+y <= MAXINT) Add = x + y else Add = MAXINT Equal(x, y): Boolean ::= if (x == y) Equal = TRUE else Equal = FALSE Successor(x): NaturalNumber ::= if (x == MAXINT) Successor = x else Successor = x +1 Substract(x, y): NaturalNumber ::= if (x <
    [Show full text]
  • A Proposal for a Revision of ISO Modula-2 2
    A Proposal for a Revision of ISO Modula-2 Benjamin Kowarsch, Modula-2 Software Foundation ∗ May 2020 (arXiv.org preprint) Abstract The Modula-2 language was first specified in [Wir78] by N. Wirth at ETH Z¨urich in 1978 and then revised several times. The last revision [Wir88] was published in 1988. The resulting language reports included ambiguities and lacked a comprehensive standard library. To resolve the ambiguities and specify a comprehensive standard library an ISO/IEC working group was formed and commenced work in 1987. A base standard was then ratified and published as IS 10514-1 in 1996 [JTC96]. Several conforming compilers have since been developed. At least five remain available of which at least three are actively maintained and one has been open sourced. Meanwhile, various deficiencies of the standard have become apparent but since its publication, no revision and no maintenance has been carried out. This paper discusses some of the deficiencies of IS 10514-1 and proposes a limited revision that could be carried out with moderate effort. The scope of the paper has been deliberately limited to the core language of the base standard and therefore excludes the standard library. 1 ISO/IEC Standardisation [II20] describes ISO/IEC standardisation procedures, summarised in brief below. 1.1 Organisational Overview The International Organisation for Standardisation (ISO) and the International Electrotechnical Commission (IEC) operate a joint technical committee (JTC1) to develop, publish and maintain information technology standards. JTC1 is organised into subcommittees (SC) and working groups (WG). The subcommittee for programming languages is JTC1/SC22. The working group that developed the ISO Modula-2 standard was JTC1/SC22/WG13.
    [Show full text]
  • C++: Data Abstraction & Classes
    Comp151 C++: Data Abstraction & Classes A Brief History of C++ • Bjarne Stroustrup of AT&T Bell Labs extended C with Simula-like classes in the early 1980s; the new language was called “C with Classes”. • C++, the successor of “C with Classes”, was designed by Stroustrup in 1986; version 2.0 was introduced in 1989. • The design of C++ was guided by three key principles: 1. The use of classes should not result in programs executing any more slowly than programs not using classes. 2. C programs should run as a subset of C++ programs. 3. No run-time inefficiency should be added to the language. Topics in Today's Lecture • Data Abstraction • Classes: Name Equivalence vs. Structural Equivalence • Classes: Restrictions on Data Members • Classes: Location of Function declarations/definitions • Classes: Member Access Control • Implementation of Class Objects Data Abstraction • Questions: – What is a ‘‘car''? – What is a ‘‘stack''? •A data abstraction is a simplified view of an object that includes only features one is interested in while hiding away the unnecessary details. • In programming languages, a data abstraction becomes an abstract data type (ADT) or a user-defined type. • In OOP, an ADT is implemented as a class. Example: Implement a Stack with an Array 1 data 2 3 size top . Example: Implement a Stack with a Linked List top -999 top (latest item) -999 1 2 Information Hiding •An abstract specification tells us the behavior of an object independent of its implementation. i.e. It tells us what an object does independent of how it works. • Information hiding is also known as data encapsulation, or representation independence.
    [Show full text]
  • Data Abstraction and Object-Oriented Languages; Ada Case Study
    CS323 Lecture: Data Abstraction and Object-Oriented Languages; Ada Case Study 2/4/09 Materials: Handout of Ada Demonstration Programs + Executable form (first_example, ascii_codes, rationals, stack, exception_demo, oo_demo) I. Introduction - ------------ A. As we have discussed previously, the earliest programming language paradigm (to which many languages belong) is the imperative or procedural paradigm. However, within this paradigm there have been some important developments which have led to modern languages being significantly better than older members of the same family, as well as to the development of the object-oriented paradigm. 1. One such development is "structured programming" which focussed on the control flow of programs. A major outcome of this work has been the presence, in most modern languages regardless of paradign, of certain well-defined control structures (conditional; definite and indefinite iteration). 2. A second major development has been the evolution of the idea of DATA ABSTRACTION. a. In the early days of computer science, most attention was given to algorithms and to language features to support them - e.g. procedural abstraction and control structures. b. In the late 1960's, Donald Knuth published a three volume work called "The Art of Computer Programming" which focussed attention on the fact that data structures are an equally important part of software design. i. His book is really responsible for the birth of the idea that a "Data Structures" course should be part of the CS curriculum. ii. Historically, coverage of data structures migrated from being an upper-level course in the CS major (300 level) to being a core part of the introductory sequence.
    [Show full text]
  • Designing Information Hiding Modularity for Model Transformation Languages
    Designing Information Hiding Modularity for Model Transformation Languages Andreas Rentschler Dominik Werle Qais Noorshams Lucia Happe Ralf Reussner Karlsruhe Institute of Technology (KIT) 76131 Karlsruhe, Germany {rentschler, noorshams, happe, reussner}@kit.edu [email protected] Abstract designed to transform models into other models and finally into code artifacts. These so-called transformation languages aim to ease Development and maintenance of model transformations make up a development efforts by offering a succinct syntax to query from and substantial share of the lifecycle costs of software products that rely map model elements between different modeling domains. on model-driven techniques. In particular large and heterogeneous However, development and maintenance of model transforma- models lead to poorly understandable transformation code due to tions themselves are expected to make up a substantial share of missing language concepts to master complexity. At the present time, the lifecycle costs of software products that rely on model-driven there exists no module concept for model transformation languages techniques [20]. Much of the effort in understanding model trans- that allows programmers to control information hiding and strictly formations arises from complexity induced by a high degree of declare model and code dependencies at module interfaces. Yet only data and control dependencies. Complexity is connected to size, then can we break down transformation logic into smaller parts, so structural complexity and heterogeneity of models involved in a that each part owns a clear interface for separating concerns. In this transformation. Visualization techniques can help to master this paper, we propose a module concept suitable for model transforma- complexity [17, 20].
    [Show full text]
  • Finding What C++ Lost: Tracking Referent Objects with GCC
    MEng Project Report Finding What C++ Lost: Tracking Referent Objects with GCC Alexander Lamaison Department of Computing Imperial College London <[email protected]> Supervisor: Paul H. J. Kelly <[email protected]> 16th June 2007 ii Abstract Pointer arithmetic in C and C++, though powerful, is inherently dangerous and a constant source of program flaws. Technologies exist to reduce these risks, one of which is Mudflap: an extension to GCC 4 that performs runtime bounds-checking. This has an number of weak- nesses, notably its inability to detect the case of pointers that stray between objects allocated adjacently. We have created MIRO, an improved version which uses the stricter approach of referent object tracking. As well as increasing the effectiveness of the checking, this allows us to provide more useful debugging information when violations occur. It requires no changes to existing C and C++ programs including non-compliant, but common, calculations involving invalid pointer values. Code compiled with MIRO can be mixed freely with unchecked code combining safety with performance and reducing the threshold to adoption. Acknowledgements I would like to thank, firstly, my supervisor and mentor Paul Kelly for guiding me through this project as well as the last four years of study and for always listening to my ramblings without complaint. Steven Lovegrove, my friend, in particular for helping me track down the source of the C++ STL problem but for help with many other problems too. Also John Wilander, Mariam Kamkar, Michael Zhivich, Tim Leek and Richard Lippmann for kindly donating the source code to their testbeds that were so invaluable throughout this project.
    [Show full text]